Abstract

In  layered  networks,  a  single  failure  at  the  lower  (physical)  layer  may  cause  multiple 
failures  at  the  upper  (logical)  layer.  As  a  result,  traditional  schemes  that  protect 
against  single  failures  may  not  be  effective  in  layered  networks.  This  thesis  studies 
the  problem  of  maximizing  network  survivability  in  the  layered  setting,  with  a  focus 
on  optimizing  the  embedding  of  the  logical  network  onto  the  physical  network. 

In  the  first  part  of  the  thesis,  we  start  with  an  investigation  of  the  fundamental 
properties  of  layered  networks,  and  show  that  basic  network  connectivity  structures, 
such  as  cuts,  paths  and  spanning  trees,  exhibit  fundamentally  different  characteristics 
from their single-layer counterparts. This leads to our development of a new cross-layer
survivability metric that properly quantifies the resilience of the layered network
against  physical  failures.  Using  this  new  metric,  we  design  algorithms  to  embed 
the  logical  network  onto  the  physical  network  based  on  multi-commodity  flows,  to 
maximize  the  cross-layer  survivability. 

In  the  second  part  of  the  thesis,  we  extend  our  model  to  a  random  failure  setting 
and  study  the  cross-layer  reliability  of  the  networks,  defined  to  be  the  probability 
that  the  upper  layer  network  stays  connected  under  the  random  failure  events.  We 
generalize  the  classical  polynomial  expression  for  network  reliability  to  the  layered 
setting.  Using  Monte-Carlo  techniques,  we  develop  efficient  algorithms  to  compute 
an approximate polynomial expression for reliability, as a function of the link failure
probability. The construction of the polynomial eliminates the need to resample
when the cross-layer reliability under different link failure probabilities is assessed.
Furthermore, the polynomial expression provides important insight into the connection
between the link failure probability, the cross-layer reliability and the structure
of  a  layered  network.  We  show  that  in  general  the  optimal  embedding  depends  on  the 
link  failure  probability,  and  characterize  the  properties  of  embeddings  that  maximize 
the  reliability  under  different  failure  probability  regimes.  Based  on  these  results,  we 
propose  new  iterative  approaches  to  improve  the  reliability  of  the  layered  networks. 
We demonstrate via extensive simulations that these new approaches result in embeddings
with significantly higher reliability than existing algorithms.

Chapter 1


Introduction 


Layering is a fundamental concept in modern network design. It describes the decomposition
of the network’s functions into separate logical components. The way
the functions are divided, as well as the interactions among these logical components,
defines the network architecture. In modern communication networks, these
components, called layers, are often organized as a stack, where each layer relies on
the services provided by the layer below to implement the services used by the layer
above. Common network models based on stacked layering include the OSI Reference
Model [127] and the TCP/IP model [23]. The decomposition of network functionalities
allows each layer to hide much of its internal complexity and provide a clean
interface  to  the  client  of  its  services.  For  instance,  in  the  OSI  Reference  Model,  the 
physical  layer  is  responsible  for  providing  a  “pipe”  with  a  certain  amount  of  bandwidth 
to  the  layer  above.  The  actual  physical  medium  that  implements  the  pipe,  however, 
is  opaque  to  the  upper  layer.  Similarly,  the  data  link  layer  is  responsible  for  framing, 
multiplexing  and  demultiplexing  data  that  is  sent  over  the  physical  layer.  It  defines 
the  protocol  for  reliable  data  transmission  over  the  physical  link.  This  transforms 
the  raw  bandwidth  provided  by  the  physical  layer  into  channels  that  allow  the  upper 
layer  to  reliably  access  and  share  the  physical  bandwidth.  Such  a  layering  approach 
greatly  simplifies  the  network  design  and  makes  it  possible  to  implement  and  operate 
the  network  in  a  modularized  and  evolvable  manner. 

A pertinent example of a multi-layer network is the IP-over-WDM network, as
shown in Figure 1-1. At the lower layer is a Wavelength Division Multiplexing (WDM)
network  which  consists  of  the  optical  switches  connected  by  the  physical  fibers.  On 
top  of  the  WDM  network  is  an  IP  network  where  the  IP  routers  are  connected  using 
(WDM)  lightpaths.  Each  lightpath  is  realized  by  setting  up  a  physical  connection 
using  one  of  the  wavelength  channels  in  the  optical  fibers.  In  this  IP-over-WDM 
architecture,  the  network  topology  in  the  upper  layer,  called  the  logical  topology,  is 
defined  by  the  set  of  IP  routers  and  the  lightpaths  connecting  them.  On  the  other 
hand,  the  physical  topology  is  defined  by  the  (possibly  different)  set  of  optical  switches 
and  the  fibers  connecting  them.  In  this  thesis,  we  will  discuss  our  results  in  the 
context of IP-over-WDM networks; as such, we will use the terms “logical links” and
“lightpaths” interchangeably. However, the concepts discussed are equally applicable
to other layered architectures, such as IP over ATM, ATM over SONET, etc.


Figure  1-1:  An  IP-over-WDM  network  where  the  IP  routers  are  connected  using  optical  lightpaths. 
The  logical  links  (arrowed  lines  on  top)  are  formed  using  lightpaths  (arrowed  lines  at  the  bottom) 
that  are  routed  on  the  physical  fiber  (thick  gray  lines  at  the  bottom).  In  general,  the  logical  and 
physical  topologies  are  not  the  same. 


In  multi-layer  networks,  the  design  of  the  logical  topology  is  often  decoupled  from 
the  physical  topology.  For  example,  it  is  very  possible  that  two  logical  nodes  that 
are  connected  by  a  logical  link  are  not  directly  connected  by  a  physical  fiber.  In  this 
case,  the  logical  link  can  be  created  by  setting  up  a  lightpath  that  traverses  multiple 
physical  hops.  This,  however,  involves  selecting  the  physical  route  taken  by  the 
lightpath.  The  choice  of  physical  routes  taken  by  the  lightpaths  in  the  logical  topology, 
called the lightpath routing, has significant implications for capacity requirements and
network survivability. As an illustrative example, Figures 1-2(a) and 1-2(b) show
the physical and logical topologies of a layered network, and Figures 1-2(c) and
1-2(d) show two different lightpath routings. In Figure 1-2(c), the two logical links
between s and t are routed over the same physical path. From a capacity standpoint,
this  means  that  the  physical  fiber  must  have  the  capacity  to  support  two  lightpaths 
within  the  same  fiber.  From  a  survivability  standpoint,  this  means  a  single  fiber  cut 
can  cause  both  of  the  logical  links  to  fail  simultaneously,  thereby  disconnecting  the 
logical  nodes  s  and  t.  As  a  result,  the  logical  network  is  susceptible  to  a  single  physical 
failure. In contrast, in Figure 1-2(d), the logical links are routed disjointly over the
physical network. In this case, the physical fibers only need the capacity to support
one wavelength channel, and any single fiber cut will only result in failure of at most
one  logical  link. 
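
The contrast between the two routings can be checked mechanically. The following is a minimal sketch, assuming a hypothetical four-node physical ring (the intermediate nodes `a` and `b`, the link names `e1` and `e2`, and all function names are illustrative assumptions, not the topology of Figure 1-2). It fails each fiber in turn and lists the fibers whose loss disconnects s from t at the logical layer.

```python
from collections import defaultdict

def fiber(u, v):
    """Undirected fiber, identified by its unordered endpoint pair."""
    return frozenset((u, v))

def path_fibers(path):
    """Set of fibers traversed by a physical path given as a node list."""
    return {fiber(u, v) for u, v in zip(path, path[1:])}

def logically_connected(nodes, logical_links, routing, failed_fiber):
    """True if the logical topology stays connected after one fiber fails."""
    adj = defaultdict(set)
    for link_id, (u, v) in logical_links.items():
        # A logical link survives only if its lightpath avoids the failed fiber.
        if failed_fiber not in path_fibers(routing[link_id]):
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == set(nodes)

def fatal_fibers(fibers, nodes, logical_links, routing):
    """Fibers whose individual failure disconnects the logical topology."""
    return [f for f in fibers
            if not logically_connected(nodes, logical_links, routing, f)]

# Hypothetical physical ring s - a - t - b - s (an assumption for illustration).
fibers = [fiber("s", "a"), fiber("a", "t"), fiber("t", "b"), fiber("b", "s")]
logical_nodes = {"s", "t"}
logical_links = {"e1": ("s", "t"), "e2": ("s", "t")}

shared = {"e1": ["s", "a", "t"], "e2": ["s", "a", "t"]}    # same physical path
disjoint = {"e1": ["s", "a", "t"], "e2": ["s", "b", "t"]}  # fiber-disjoint paths

print(len(fatal_fibers(fibers, logical_nodes, logical_links, shared)))    # 2
print(len(fatal_fibers(fibers, logical_nodes, logical_links, disjoint)))  # 0
```

Under the shared routing, either fiber on the common path is a single point of failure; under the disjoint routing, no single fiber cut separates s and t.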


Figure  1-2:  Routing  logical  links  differently  can  affect  capacity  requirement  and  survivability. 


Therefore,  by  routing  the  lightpaths  intelligently  over  the  physical  network,  one 
can  increase  utilization,  as  well  as  improve  survivability  of  the  network.  While  the 
impact on the utilization has been quite extensively studied [3, 14, 15, 52, 69, 85, 95,
115, 126], the survivability aspect is relatively unexplored. The main focus of this
thesis is to develop a deeper understanding of how multi-layer survivability can be
achieved by a good lightpath routing. We will consider the following model for a
two-layer  network: 

•  A physical topology at the lower layer, modelled by a network graph $G_P = (V_P, E_P)$;

•  A logical topology at the upper layer, modelled by a separate network graph $G_L = (V_L, E_L)$, where $V_L \subseteq V_P$;

•  A lightpath routing, which maps each logical link $(s, t) \in E_L$ to a physical $(s, t)$-path in $G_P$.

Associated  with  the  layered  network  is  a  survivability  measure  y,  which  maps 
the  lightpath  routing  to  a  non-negative  real  number  that  quantifies  its  survivability 
performance.  Throughout  the  thesis,  we  will  consider  different  definitions  for  y,  and 
study  two  classes  of  problems: 

1.  Survivability  Measurement:  Given  the  physical  and  logical  topologies,  as 
well  as  the  lightpath  routing  R  as  input,  compute  y(R). 

2.  Survivable  Lightpath  Routing:  Given  the  physical  and  logical  topologies, 
find  the  lightpath  routing  R  that  maximizes  y(R). 
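
To make the Survivability Measurement problem concrete, the sketch below evaluates one possible survivability measure by exhaustive search. The choice of measure is an assumption made purely for illustration: the smallest number of physical fibers whose simultaneous failure disconnects the logical topology. The fiber names, the toy topology and the function names are likewise hypothetical.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Union-find connectivity test for an undirected (multi)graph."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in nodes}) == 1

def min_disconnecting_fibers(fibers, logical_nodes, routing):
    """Smallest k such that some k fibers, failing together, disconnect the logical topology.

    routing: list of ((s, t), fibers_on_path) pairs, one per logical link.
    Exhaustive search, exponential in len(fibers); only suitable for toy inputs.
    """
    for k in range(1, len(fibers) + 1):
        for failed in combinations(fibers, k):
            failed = set(failed)
            surviving = [link for link, path in routing
                         if failed.isdisjoint(path)]
            if not is_connected(logical_nodes, surviving):
                return k
    return float("inf")  # no fiber failure set disconnects the logical topology

fibers = ["f12", "f23", "f34", "f41"]  # hypothetical 4-node physical ring
logical_nodes = [1, 2, 3]              # logical triangle on nodes 1, 2, 3
routing_a = [((1, 2), ["f12"]), ((2, 3), ["f23"]), ((1, 3), ["f41", "f34"])]
routing_b = [((1, 2), ["f12"]), ((2, 3), ["f23"]), ((1, 3), ["f12", "f23"])]

print(min_disconnecting_fibers(fibers, logical_nodes, routing_a))  # 2
print(min_disconnecting_fibers(fibers, logical_nodes, routing_b))  # 1
```

The two routings realize the same logical topology on the same physical topology, yet score differently. This is the sense in which the measure is a function of the lightpath routing, and why the routing problem seeks the routing with the highest score.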

In the rest of this chapter, we will provide background on network survivability
in  Section  1.1,  and  discuss  existing  works  in  cross-layer  survivability  in  Section  1.2. 
Then  in  Section  1.3,  we  will  present  an  outline  of  the  thesis  and  highlight  our  major 
contributions. 


1.1  Background  on  Network  Survivability 

The two main approaches to providing network survivability are protection and restoration.
Protection refers to rapid and preplanned recovery mechanisms where in the
event of a failure, traffic is switched over to back-up paths. On the other hand,
restoration refers to recovery mechanisms whereby back-up paths are found dynamically
in the event of a failure [50]. Network survivability at a single layer has been
studied  extensively  and  the  literature  on  protection  and  restoration  is  extremely 
rich [5, 30, 38, 44, 49-51, 58, 68, 71, 78, 88, 92, 93, 104, 109, 117, 118]. Here we provide
a brief overview of protection and restoration in single-layer networks, highlighting
the issues that are key to this thesis.

Protection can be provided at the various layers [45, 51, 99]. Protection mechanisms
are classified into link protection and path protection. Link protection recovers
from a link failure by rerouting the traffic around the failed link (e.g., using loopback
protection [30, 44, 88, 92, 93]). In contrast, path protection reroutes traffic using
a back-up end-to-end path for each traffic stream [58, 78, 92, 93, 104]. For example,
SONET rings employ either link-based or path-based protection switching [50, 51]
to guarantee recovery within 60 ms. For path protection, SONET reserves primary
and back-up paths in opposite directions around the ring, while link protection is
accomplished by rerouting the traffic around the ring from one end of the failed
link to the other [49, 117]. Similarly, both path and link protection can be employed
in general mesh network topologies (e.g., ATM, WDM, etc.). Path protection is accomplished
by establishing disjoint primary and back-up paths from the source to the
destination; the two paths must be disjoint to ensure that they do not fail simultaneously
[92, 93]. Link protection in mesh networks can be accomplished through
the use of protection cycles that provide a path from the source to the destination of
the failed link [38, 104].

In contrast, restoration does not involve preplanning of back-up paths, and is typically
provided at the electronic (or logical) layers. The simplest example of restoration
is that of packet traffic in the Internet, where the Internet Protocol (IP) automatically
recovers from link failures by rerouting packets, using its standard routing algorithms
(e.g., OSPF, etc.) [57, 68, 71]. Restoration can also be done for connection
traffic on an end-to-end basis, where a new path is established dynamically after a
failure [5, 93, 109]. However, since restoration does not utilize preplanned back-up
paths, it typically takes longer to recover from failures. Moreover, failure recovery
is not guaranteed as a back-up path may not exist or back-up capacity may not be
available.

Different network technologies use either protection or restoration for failure recovery,
and the choice is driven by the service being provided. The distinction between
protection and restoration is important because they impose different requirements
on the network design. For example, protection is typically done using disjoint
primary and back-up paths. Hence network topologies must be able to easily accommodate
disjoint paths. For this reason SONET uses a ring architecture where
disjoint paths can be easily established around the ring. In contrast, restoration
reroutes traffic by finding an alternative path after the failure. This imposes a somewhat
less stringent requirement in that the network merely has to remain connected
in  order  to  reroute  traffic,  subject  to  sufficient  capacity. 

Typically,  protection  or  restoration  is  provided  at  the  electronic  (logical)  layer, 
because  it  is  needed  to  recover  from  electronic  layer  failures  (e.g.,  line  card  failure). 
Although  physical  layer  protection  is  also  possible,  it  is  often  very  costly  in  terms 
of  additional  protection  capacity  and  is  often  incompatible  with  the  electronic  layer 
protection  mechanism  (e.g.,  SONET  protection  switching  is  initiated  within  a  few 
milliseconds;  not  nearly  enough  time  for  optical  layer  protection  to  take  effect)  [50, 
51].  Moreover,  since  the  electronic  layer  typically  offers  protection  or  restoration 
mechanisms,  protection  at  the  physical  layer  is  often  redundant  [57].  Hence,  in  this 
thesis  we  focus  on  network  architectures  where  the  protection  and/or  restoration  is 
provided  at  the  electronic  layer  only. 


1.2  Previous  Work  on  Cross-Layer  Survivability 

While protection and restoration have been extensively studied in single-layer networks,
their applicability to cross-layer networks is not well understood. For example,
protection mechanisms rely on finding disjoint paths in the network, a well understood
problem in single-layer graphs. However, in multi-layer networks, once the
logical topology is embedded on the physical topology, a physical fiber link may
carry multiple logical links. Therefore, disjoint paths at the logical layer may not
be disjoint at the physical layer, rendering the logical layer protection ineffective.
Similarly, restoration mechanisms require the network to remain connected after a
failure. While connectivity in single-layer graphs is well understood, in a multi-layer
network,  a  physical  layer  failure  can  lead  to  multiple  logical  link  failures,  which  makes 
it  possible  to  disconnect  the  logical  network  even  if  the  logical  topology  is  designed 
to  have  high  connectivity. 

Cross-layer survivability has received relatively limited attention in the literature.
Most previous works on cross-layer survivability have been in the context of
WDM-based networks and consider very specific objectives, such as routing lightpaths
to survive single link failures in optical networks or finding disjoint paths that
do not share a common network failure, generally called a Shared Risk Link Group
(SRLG) [8, 18, 19, 28, 35, 37, 56, 72, 84, 91, 100, 103, 105, 113, 120-122, 125].

The  impact  of  physical  layer  failures  on  the  connectivity  of  the  logical  topology 
was first studied by Crochat et al. [6, 33, 34] in the context of WDM-based networks.
The authors proposed heuristic algorithms for routing the lightpaths that constitute
the logical topology on the physical topology, so as to minimize the number
of disconnected node pairs on the logical topology in the event of a single physical link
failure. Modiano and Narula-Tam [76] first introduced the notion of Survivable Lightpath
Routing, which is defined to be a routing of the logical links over the physical
topology so that the logical topology remains connected in the event of a single fiber
failure. The same paper developed mathematical conditions for routing lightpaths on
the physical topology so that the logical topology remains connected even if one of the
fibers fails, and formulated the problem as an Integer Linear Program (ILP). In [36],
Deng, Sasaki and Su developed a Mixed Integer Linear Program (MILP) for the survivable
routing problem with a polynomial number of constraints. Todimala et al. [113]
generalized the problem definition to cover single SRLG failures, and developed an
ILP as well as heuristic algorithms. The problem of routing logical rings survivably
on the physical network was studied in [76, 81, 101, 102]. In particular, [81] considered
the physical network design problem and proposed several special physical topologies
that guarantee the existence of survivable lightpath routings for logical rings. In [67],
Kurant  et  al.  introduced  the  notion  of  piecewise  survivable  mapping  and  developed 
an  algorithm  to  compute  survivable  lightpath  routings  based  on  piecewise  survivable 
components.  The  same  technique  was  extended  to  compute  lightpath  routings  that 
are  survivable  against  k  failures,  for  a  fixed  value  of  k  [66].  In  [112],  Thulasiraman  et 
al.  introduced  the  idea  of  adding  protection  edges  to  the  logical  topology  in  the  case 
where a survivable lightpath routing cannot be found by Kurant’s algorithm. Based
on  this  idea,  the  authors  enhanced  Kurant’s  algorithm  to  always  return  a  survivable 
lightpath  routing,  at  the  expense  of  the  extra  protection  edges. 

The related issue of SRLG failures was introduced in the Generalized Multi-Protocol
Label Switching (GMPLS) standard in the IETF for failure management [28,
91, 100]. An SRLG is a group of lightpaths that fail simultaneously upon a single physical
failure. For example, for a particular optical fiber, all the lightpaths that traverse
the same fiber form an SRLG. Thus, in order to provide rapid protection, two SRLG-disjoint
paths, i.e., paths that do not share a common SRLG, must be used. This
SRLG-Disjoint Path Problem (SDPP) was first studied in [18] and subsequently in
the book written by the same author [19]. In [56] the problem was shown to be NP-complete,
and heuristic algorithms for different variations of the SDPP were
proposed in [8, 72, 84, 103, 120-122]. Various aspects of network design under SRLG
constraints were also studied in [35, 37, 105, 113, 125].
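
The SRLG notion above can be illustrated with a short sketch. It assumes that each fiber induces one SRLG (the set of lightpaths routed over it), and that a logical path is given as a list of lightpath identifiers. The identifiers, fiber names and function names are hypothetical.

```python
from collections import defaultdict

def srlgs_by_fiber(routing):
    """Map each fiber to the set of lightpaths routed over it (its SRLG)."""
    groups = defaultdict(set)
    for lightpath, path in routing.items():
        for f in path:
            groups[f].add(lightpath)
    return dict(groups)

def srlg_disjoint(path_a, path_b, routing):
    """True if no single fiber carries a lightpath from both logical paths."""
    fibers_a = {f for lp in path_a for f in routing[lp]}
    fibers_b = {f for lp in path_b for f in routing[lp]}
    return fibers_a.isdisjoint(fibers_b)

# Logical triangle a-b-c; lightpath "ac" shares fiber f1 with lightpath "ab".
routing = {"ab": ["f1"], "bc": ["f2"], "ac": ["f1", "f3"]}
print(sorted(srlgs_by_fiber(routing)["f1"]))          # ['ab', 'ac']
print(srlg_disjoint(["ac"], ["ab", "bc"], routing))   # False

# Rerouting "ac" away from f1 makes the two logical paths SRLG-disjoint.
rerouted = dict(routing, ac=["f3"])
print(srlg_disjoint(["ac"], ["ab", "bc"], rerouted))  # True
```

The two logical paths from a to c share no logical link in either case; only the underlying routing determines whether a single fiber failure can take down both.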


1.3  Contributions 

1.3.1  Theoretical  Underpinnings  of  Cross-Layer  Survivability 
Problems 

As  discussed  in  the  previous  section,  all  existing  works  in  cross-layer  survivability 
consider  very  specific  objectives  and  the  primary  focus  is  to  design  algorithms  for 
these problems. This thesis attempts to develop a more rigorous treatment of cross-layer
survivability in order to provide the foundation for quantifying and optimizing
survivability in layered networks. We will start with the questions of why, and to what
extent, existing protection and restoration mechanisms do not work in the multi-layer
setting. Section 2.2 offers answers to these questions by exposing the structural differences
between single-layer and multi-layer networks. More specifically, we propose
a model for multi-layer networks that generalizes the classical network graph model
for single-layer networks. We will show that connectivity structures in this generalized
setting, such as paths, cuts, and spanning trees, exhibit fundamentally different
properties from their single-layer counterparts; as such, special graph properties that
constitute the foundation of single-layer survivability, such as the max-flow min-cut
relationship, do not carry over to multi-layer networks. In addition, we prove several
results that reveal the new max-flow min-cut relationship in multi-layered networks,
as well as NP-Hardness for computing various basic graph structures in the multi-layer
setting, such as maximum disjoint paths, minimum cuts and minimum spanning
trees. This collection of results suggests a fundamental structural difference between
single-layer and multi-layer networks, which has the following profound implications:


1.  Protection  and  restoration  mechanisms  designed  for  single-layer  networks  may 
not  be  effective  in  the  multi-layer  setting. 

2.  Common metrics, such as connectivity, that are used to quantify survivability
for single-layer networks lose much of their meaning if applied blindly to multi-layer
networks.

3.  Existing algorithms for assessing and maximizing survivability for single-layer
networks are not easily extendable to the multi-layer setting, due to the fundamental
differences between the two types of networks and the inherent hardness
of computing multi-layer connectivity structures.


1.3.2 Metrics and Algorithms for Survivable Layered Network
Design

The observations from Section 2.2 motivate us to reinvestigate basic issues in survivability
for multi-layer networks, starting with the definition of cross-layer survivability.
In order to understand the survivability performance of a multi-layer network
design, it is important to define metrics that properly capture multi-layer survivability.
Unfortunately, due to the inherent complexity of cross-layer structures, defining a
meaningful cross-layer survivability metric is non-trivial. Therefore, in Section 2.3 we
propose guidelines for cross-layer survivability metric design, defining several properties
that a metric must satisfy in order to be a suitable cross-layer survivability
metric. Based on these guidelines, we define two cross-layer survivability metrics,
called  Min  Cross  Layer  Cut  and  Min  Weighted  Load  Factor.  We  will  explain  their 
physical  meanings  and  discuss  how  these  metrics  can  be  computed.  We  will  also 
investigate  their  mathematical  properties,  which  reveal  certain  inherent  connections 
between  the  metrics  and  provide  insight  into  our  development  of  ILP  formulations 
for  the  Survivable  Lightpath  Routing  problem. 

In  Section  2.4  we  will  formulate  the  Survivable  Lightpath  Routing  problem  as 
a  survivability  maximization  problem,  using  Min  Cross  Layer  Cut  (MCLC)  as  the 
optimization objective. Due to the inherent difficulty in maximizing the metric directly,
in Section 2.4 we consider ILP approximations for the MCLC maximization
problem. We run extensive simulations comparing the survivability performance of
these formulations with the existing Survivable Lightpath Routing algorithm in the
literature. The results show that our approach to maximize an approximation of
the MCLC can often lead to lightpath routings with significantly better survivability
performance than existing algorithms. In addition, our simulation results also
suggest that a formulation that closely approximates the MCLC maximization, combined
with the randomized rounding technique, provides an efficient way to design
multi-layer  networks  with  good  survivability  performance. 



1.3.3  Extension  to  Random  Physical  Failures 


In  the  second  part  of  the  thesis,  we  will  extend  our  investigation  to  the  random 
physical  failure  model,  where  all  physical  links  are  assumed  to  fail  independently 
with  certain  probability.  Similar  to  the  deterministic  model,  a  physical  link  failure 
will  affect  all  the  logical  links  that  use  that  physical  link.  The  metric  of  interest 
under this model is the cross-layer reliability, which is the probability that the logical
topology  stays  connected  under  the  random  physical  failures. 

Computing  reliability  was  shown  to  be  #P-complete  in  single  layer  networks  [114], 
and  even  approximating  the  reliability  to  within  a  constant  factor  cannot  be  done 
in  polynomial  time  [87].  Although  there  are  works  aimed  at  exact  computation  of 
reliability  through  graph  transformation  and  reduction  [27,73,83,86,98,106,107,111], 
the  applications  of  such  methods  are  limited  to  specific  topologies.  Because  of  the 
difficulty  in  assessing  network  reliability,  most  previous  works  in  this  context  focused 
on estimating the network reliability, either by deterministic “best-effort” approaches
without accuracy guarantee [24, 31, 53, 89, 94], or by Monte Carlo simulations [41, 62,
63, 82] with probabilistic accuracy guarantee.

Although there has been a large body of work on estimating single-layer network
reliability, cross-layer reliability has not been explored previously. Our main
contributions in this area are new algorithms for cross-layer reliability estimation and
maximization, as well as theoretical results that lead to a deeper understanding of
structures in layered networks that contribute to high reliability. In Chapter 3, we
develop an algorithm that yields a polynomial expression [12] for the reliability of a
given multi-layer network. This expression provides a formula for cross-layer reliability
as a function of the physical link failure probability. In contrast to many existing
reliability estimation methods for single-layer networks [41, 62, 63], our method is not
tailored to a particular probability of link failure, and consequently, it does not require
resampling in order to estimate reliability under different values of link failure
probability. That is, once the polynomial is estimated, it can be used for any value
of link failure probability without resampling.
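
The idea of estimating the polynomial once and reusing it for any failure probability can be sketched as follows. This is an illustration only, not the estimator developed in Chapter 3. It assumes the classical form in which, with m physical links each failing independently with probability p, the failure probability is the sum over i of N_i p^i (1-p)^(m-i), where N_i counts the sets of i physical links whose failure disconnects the logical topology. Each N_i is estimated here by plain uniform sampling of i-subsets; the toy network and all names are hypothetical.

```python
import random
from math import comb

def disconnected(failed, logical_nodes, routing):
    """True if the logical topology is disconnected once `failed` fibers are down."""
    adj = {v: set() for v in logical_nodes}
    for (u, v), path in routing:
        if failed.isdisjoint(path):  # lightpath survives
            adj[u].add(v)
            adj[v].add(u)
    start = logical_nodes[0]
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) < len(logical_nodes)

def estimate_cut_counts(fibers, logical_nodes, routing, samples=2000, seed=0):
    """Estimate N_i for i = 0..m from `samples` random i-subsets of fibers each."""
    rng = random.Random(seed)
    m = len(fibers)
    counts = []
    for i in range(m + 1):
        hits = sum(
            disconnected(set(rng.sample(fibers, i)), logical_nodes, routing)
            for _ in range(samples)
        )
        counts.append(comb(m, i) * hits / samples)
    return counts

def reliability(counts, p):
    """Evaluate 1 - sum_i N_i p^i (1-p)^(m-i) at link failure probability p."""
    m = len(counts) - 1
    return 1.0 - sum(n * p**i * (1 - p)**(m - i) for i, n in enumerate(counts))

# Toy network: two parallel logical links s-t, both routed over fibers f1 and f2.
fibers = ["f1", "f2", "f3", "f4"]
logical_nodes = ["s", "t"]
routing = [(("s", "t"), ["f1", "f2"]), (("s", "t"), ["f1", "f2"])]

counts = estimate_cut_counts(fibers, logical_nodes, routing)  # sampled once
for p in (0.01, 0.1, 0.5):                                    # reused for every p
    print(p, round(reliability(counts, p), 3))
```

In this toy case the logical topology survives exactly when f1 and f2 both survive, so the true reliability is (1-p)^2 and the printed values should be close to it. The sampling step never sees p, which is why no resampling is needed when p changes.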

The polynomial expression given by the algorithm also reveals important structural
information of the underlying layered network, which provides clear insights
into how lightpath routing should be designed for better reliability. In Chapter 4,
we investigate the relationship between the link failure probability, the cross-layer
reliability and the structure of a layered network. We show that the structures of the
optimal lightpath routings depend on the link failure probability. In particular, lightpath
routings that are optimal in the regime where the link failure probability is low
are structurally different from lightpath routings that are optimal in the regime where
the link failure probability is high. The investigation culminates in characterizations
of  optimal  lightpath  routings  in  the  two  probability  regimes.  These  characterizations 
reveal  the  criteria  for  maximizing  the  cross-layer  reliability  of  lightpath  routings  under 
the  respective  probability  regimes,  which  provides  important  insights  into  developing 
survivable  lightpath  routing  algorithms  to  maximize  cross-layer  reliability. 

Based on the insights developed in Chapter 4, Chapter 5 explores different methods
for maximizing cross-layer reliability of a given lightpath routing in the low probability
regime. Specifically, we study two different approaches to improve the reliability
of a layered network. The first approach is lightpath rerouting, which involves
incrementally choosing a new physical route for an existing lightpath, so that the
cross-layer reliability can be improved by such a reroute. The second approach is
logical topology augmentation, where a new lightpath is added to the logical topology
to  improve  reliability.  For  each  approach,  we  formulate  the  reliability  improvement 
achieved  by  a  rerouting/augmentation  step,  and  develop  algorithms  to  maximize  the 
reliability  improvement.  By  iteratively  applying  the  algorithm,  one  can  incrementally 
improve  the  reliability  of  the  network  until  no  further  local  improvement  is  possible. 
This  gives  us  effective  ways  to  generate  lightpath  routings  with  better  reliability  than 
all  lightpath  routing  algorithms  previously  considered.  Finally,  in  Section  5.3,  we 
carry out a case study on a real-world IP-over-WDM network, and apply the techniques
discussed in this thesis to study reliability in a real-world setting.

Chapter 2


Fundamentals  of  Cross-Layer 
Survivability 

2.1 Introduction

A  key  aspect  that  is  new  in  the  layered  network  setting  is  the  sharing  of  physical  fibers 
by  multiple  logical  links.  Because  of  this,  a  single  physical  failure  will  propagate  to  the 
logical  layer  and  cause  logical  links  to  fail  in  a  correlated  fashion.  This  correlation  is 
implicitly  determined  by  the  lightpath  routing,  and  this  phenomenon  fundamentally 
changes  the  connectivity  structures  of  a  network.  Algorithms  designed  to  effectively 
assess or enhance survivability of a multi-layer network must therefore take into
account such dependencies. Most existing protection and restoration mechanisms for
single-layer  networks  assume  uncorrelated  failures  in  the  network,  and  therefore  may 
no  longer  be  effective  in  this  multi-layer  setting. 

In  this  chapter,  we  will  develop  a  more  rigorous  treatment  of  fundamental  issues 
in  cross-layer  survivability.  In  Section  2.2,  we  will  first  study  basic  connectivity 
structures,  such  as  cuts,  paths  and  trees,  in  the  multi-layer  network  model,  and 
highlight  the  key  differences  from  their  single-layer  counterparts,  both  in  terms  of 
combinatorial properties and computational complexity. As a result of this, common
survivability  metrics  such  as  the  connectivity  of  a  network  topology  lose  much  of  their 
meaning in multi-layer networks. These findings lead us to propose new survivability
metrics for multi-layer networks, and algorithms to improve cross-layer survivability
based  on  these  new  metrics  in  Sections  2.3  and  2.4.  Simulation  results  for  these 
algorithms  will  be  presented  in  Section  2.5. 


2.2 Graph Structures in Multi-Layer Networks

In  this  section,  we  study  various  connectivity  structures  such  as  flows,  cuts,  trees  and 
paths  in  multi-layer  graphs  in  order  to  develop  insights  into  cross-layer  survivability. 
We will highlight the key differences in combinatorial properties between multi-layer
graphs and single-layer graphs. In particular, we will show that fundamental
survivability results, such as the Max Flow Min Cut Theorem, are no longer applicable to
multi-layer networks. Consequently, metrics such as “connectivity” have significantly
different meanings in the cross-layer setting. This motivates our reinvestigation in
the following sections of fundamental issues such as quantifying and maximizing
survivability in the multi-layer setting.

2.2.1  Max  Flow  vs  Min  Cut 

For single-layer networks, the Max-Flow Min-Cut Theorem [4] states that the
maximum amount of flow passing from the source s to the sink t always equals the
minimum  capacity  that  needs  to  be  removed  from  the  network  so  that  no  flow  can 
pass  from  s  to  t.  In  addition,  if  all  links  have  integral  capacity,  then  there  exists  an 
integral  maximum  flow.  This  implies  that  the  maximum  number  of  disjoint  paths 
between  s  and  t  is  the  same  as  the  minimum  cut  between  the  two  nodes.  Hence,  the 
term  connectivity  between  two  nodes  can  be  used  unambiguously  to  refer  to  different 
measures  such  as  maximum  number  of  disjoint  paths  or  minimum  cut,  and  this  makes 
it  a  natural  choice  as  the  standard  metric  for  measuring  network  survivability. 

Because  of  its  fundamental  importance,  we  would  like  to  investigate  the  Max-Flow 
Min-Cut  relationship  for  multi-layer  networks.  We  first  generalize  the  definitions  of 
Max Flow and Min Cut for layered networks:

Definition 2.1 In a multi-layer network, the Max Flow between two nodes s and t
in the logical topology is the maximum number of physically disjoint s-t paths in the
logical topology.

Definition  2.2  In  a  multi-layer  network,  the  Min  Cut  between  two  nodes  s  and  t  in 
the  logical  topology  is  the  minimum  number  of  physical  links  that  need  to  be  removed 
in  order  to  disconnect  the  two  nodes  in  the  logical  topology. 

We model the physical topology as a network graph $G_P = (V_P, E_P)$, where $V_P$ and
$E_P$ are the nodes and links in the physical topology. The logical topology is modelled
as $G_L = (V_L, E_L)$, where $V_L \subseteq V_P$. The lightpath routing is represented by a set of
binary variables $f^{st}_{ij}$, where a logical link $(s,t)$ uses physical fiber $(i,j)$ if and only if
$f^{st}_{ij} = 1$. For any pair of logical nodes $x$ and $y$, let $\mathcal{P}_{xy}$ be the set of all $x$-$y$ paths in
the logical topology. For each path $p \in \mathcal{P}_{xy}$, let $L(p)$ be the set of physical links used
by the logical path $p$, that is, $L(p) = \bigcup_{(s,t) \in p} \{(i,j) : f^{st}_{ij} = 1\}$. Then the Max Flow
and Min Cut between nodes $s$ and $t$ can be formulated mathematically as follows:

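$$
\begin{aligned}
\mathsf{MaxFlow}_{st}: \quad \text{Maximize } & \sum_{p \in \mathcal{P}_{st}} f_p, \quad \text{subject to:} \\
& \sum_{p :\, (i,j) \in L(p)} f_p \le 1 \qquad \forall (i,j) \in E_P \qquad (2.1) \\
& f_p \in \{0,1\} \qquad \forall p \in \mathcal{P}_{st}.
\end{aligned}
$$

$$
\begin{aligned}
\mathsf{MinCut}_{st}: \quad \text{Minimize } & \sum_{(i,j) \in E_P} y_{ij}, \quad \text{subject to:} \\
& \sum_{(i,j) \in L(p)} y_{ij} \ge 1 \qquad \forall p \in \mathcal{P}_{st} \qquad (2.2) \\
& y_{ij} \in \{0,1\} \qquad \forall (i,j) \in E_P.
\end{aligned}
$$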

The variable $f_p$ in the formulation $\mathsf{MaxFlow}_{st}$ indicates whether the path $p$ is
selected for the set of $(s,t)$-disjoint paths. Constraint (2.1) requires that no selected
logical paths share a physical link. Similarly, in the formulation $\mathsf{MinCut}_{st}$, the variable
$y_{ij}$ indicates whether the physical fiber $(i,j)$ is selected for the minimum $(s,t)$-cut.
Constraint (2.2) requires that all logical paths between $s$ and $t$ traverse some physical
fiber $(i,j)$ with $y_{ij} = 1$.

Note that the above formulations generalize the Max Flow and Min Cut for
single-layer networks. In particular, the formulations model the classical Max Flow
and Min Cut of a graph $G$ if both $G_P$ and $G_L$ are equal to $G$, and $f^{st}_{ij} = 1$ if and only
if $(s,t) = (i,j)$.

Let $\mathsf{MaxFlow}_{st}$ and $\mathsf{MinCut}_{st}$ be the optimal values of the above Max Flow and Min
Cut formulations. We also denote by $\mathsf{MaxFlow}^R_{st}$ and $\mathsf{MinCut}^R_{st}$ the optimal values of
the linear relaxations of the above Max Flow and Min Cut formulations. The Max-Flow
Min-Cut Theorem for single-layer networks can then be written as follows:

$$\mathsf{MaxFlow}_{st} = \mathsf{MaxFlow}^R_{st} = \mathsf{MinCut}^R_{st} = \mathsf{MinCut}_{st}.$$

The  equality  among  these  values  has  profound  implications  on  survivable  network 
design  for  single-layer  networks.  Because  all  these  survivability  measures  converge  to 
the  same  value,  it  can  naturally  be  used  as  the  standard  survivability  metric  that  is 
applicable to measuring both disjoint paths and minimum cuts. Another consequence
of  this  equality  is  that  linear  programs  (which  are  polynomial  time  solvable)  can  be 
used  to  find  the  minimum  cut  and  disjoint  paths  in  the  network. 

It  is  therefore  interesting  to  see  whether  the  same  relationship  holds  for  multi-layer 
networks.  First,  it  is  easy  to  verify  that  the  linear  relaxations  for  the  formulations 
$\mathsf{MaxFlow}_{st}$ and $\mathsf{MinCut}_{st}$ maintain a primal-dual relationship, which, by the Duality
Theorem [17], implies that $\mathsf{MaxFlow}^R_{st} = \mathsf{MinCut}^R_{st}$. In addition, since any feasible solution to
an  integer  program  is  also  a  feasible  solution  to  the  linear  relaxation,  we  can  establish 
the  following  relationship: 

Observation 1 $\mathsf{MaxFlow}_{st} \le \mathsf{MaxFlow}^R_{st} = \mathsf{MinCut}^R_{st} \le \mathsf{MinCut}_{st}$.

Therefore, like single-layer networks, the maximum number of disjoint paths
between two nodes cannot exceed the minimum cut between them in a multi-layer
network.

However, unlike the single-layer case, the values of $\mathsf{MaxFlow}_{st}$, $\mathsf{MaxFlow}^R_{st}$ and
$\mathsf{MinCut}_{st}$ are not always identical, as illustrated in the following example. In our
examples  throughout  the  section,  we  use  a  logical  topology  with  two  nodes  s  and 
t  that  are  connected  by  multiple  lightpaths.  For  simplicity  of  exposition,  we  omit 
the  complete  lightpath  routing  and  only  show  the  physical  links  that  are  shared  by 
multiple  lightpaths.  Theorem  2.1  states  that  this  simplification  can  be  made  without 
loss  of  generality. 

Theorem 2.1 Let $G_L$ be a logical topology with two nodes $s$ and $t$, connected by $n$
lightpaths $E_L = \{e_1, e_2, \ldots, e_n\}$, and let $\mathcal{R} = \{R_1, R_2, \ldots, R_k\}$ be a family of subsets
of $E_L$, where each $|R_i| \ge 2$, that captures the fiber-sharing relationship of the logical
links. There exist a physical topology $G_P = (V_P, E_P)$ and lightpath routing of $G_L$ over
$G_P$, such that:

1. there are exactly $k$ fibers in $E_P$, denoted by $F = \{f_1, f_2, \ldots, f_k\}$, that are used
by multiple lightpaths;

2. for each fiber $f_i \in F$, the set of lightpaths using $f_i$ is $R_i$.

Proof.  See  Appendix  2.7.1.  □ 

Theorem 2.1 implies that for a two-node logical topology, any arbitrary
fiber-sharing relationship $\mathcal{R}$ can be realized by constructing a physical topology and
lightpath  routing.  Therefore,  in  the  following  discussion,  we  can  simplify  our  examples 
by  only  giving  the  fiber-sharing  relationship  of  our  two-node  logical  topology  without 
showing  the  details  of  the  lightpath  routing. 

In Figure 2-1, the two nodes in the logical topology are connected by three
lightpaths. The logical topology is embedded on the physical topology in such a way
that each pair of lightpaths share a fiber. It is easy to see that no single fiber failure can
disconnect the logical topology, and that the failure of any pair of the shared fibers would. Hence, the value
of $\mathsf{MinCut}_{st}$ is 2 in this case. On the other hand, the value of $\mathsf{MaxFlow}_{st}$ is only 1,
because any two logical links share some physical fiber, so no two paths in the
logical network are physically disjoint. Finally, the value of $\mathsf{MaxFlow}^R_{st}$ is 1.5 because
a flow of 0.5 can be routed on each of the lightpaths without violating the capacity
constraints at the physical layer. Therefore, all three quantities are different in this
example. We now study the integrality gaps of the two formulations more carefully.
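
The three values quoted for Figure 2-1 can be checked by brute force. The following is only an illustrative sketch: it assumes the two-node topology is encoded as a hypothetical dictionary `fibers_of` that maps each lightpath to the shared fibers it uses (the labels `e1`, `fa`, and so on are invented here), and that every lightpath uses at least one listed fiber. Rather than calling an LP solver, it certifies the relaxed value by exhibiting a feasible fractional flow and a feasible fractional cut with the same objective.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical encoding of Figure 2-1: three lightpaths, each pair sharing one fiber.
fibers_of = {
    "e1": {"fa", "fb"},
    "e2": {"fa", "fc"},
    "e3": {"fb", "fc"},
}
all_fibers = sorted(set().union(*fibers_of.values()))


def max_flow(fibers_of):
    """Largest set of lightpaths that are pairwise fiber-disjoint (brute force)."""
    links = sorted(fibers_of)
    for size in range(len(links), 0, -1):
        for chosen in combinations(links, size):
            if all(fibers_of[a].isdisjoint(fibers_of[b])
                   for a, b in combinations(chosen, 2)):
                return size
    return 0


def min_cut(fibers_of, all_fibers):
    """Smallest set of listed fibers that intersects every lightpath (brute force)."""
    if any(not route for route in fibers_of.values()):
        raise ValueError("every lightpath must use at least one listed fiber")
    for size in range(len(all_fibers) + 1):
        for cut in combinations(all_fibers, size):
            if all(fibers_of[l] & set(cut) for l in fibers_of):
                return size


def is_feasible_flow(flow, fibers_of, all_fibers):
    """Relaxed Constraint (2.1): the total flow on every fiber is at most 1."""
    return all(sum(flow[l] for l in fibers_of if f in fibers_of[l]) <= 1
               for f in all_fibers)


def is_feasible_cut(y, fibers_of):
    """Relaxed Constraint (2.2): every lightpath sees total fiber weight >= 1."""
    return all(sum(y[f] for f in fibers_of[l]) >= 1 for l in fibers_of)


half = Fraction(1, 2)
flow = {l: half for l in fibers_of}   # 0.5 units on each lightpath
y = {f: half for f in all_fibers}     # weight 0.5 on each shared fiber
relaxed_lower = sum(flow.values())    # value of a feasible fractional flow
relaxed_upper = sum(y.values())       # value of a feasible fractional cut
```

Because the fractional flow and the fractional cut are both feasible and have the same objective value, weak duality pins the relaxed optimum at that common value, which lies strictly between the integral Max Flow and the integral Min Cut.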


Figure 2-1: A logical topology with 3 links where each pair of links shares a fiber in the physical topology.


Integrality  Gap  for  MaxFlowst 

The above example can be generalized to show that the ratio between $\mathsf{MaxFlow}_{st}$
and $\mathsf{MaxFlow}^R_{st}$ is $O(n)$, where $n$ is the number of paths between $s$ and $t$. Consider an
instance of lightpath routing where the two nodes in the logical network are connected
by $n$ logical links, and every pair of logical links share a separate fiber. In this case,
the value of $\mathsf{MaxFlow}_{st}$ will be 1, and the value of $\mathsf{MaxFlow}^R_{st}$ will be $n/2$, using the
same arguments as above. Therefore, the ratio is $O(n)$. Note that this is an
asymptotically tight bound since $\mathsf{MaxFlow}_{st} \ge 1$ and $\mathsf{MaxFlow}^R_{st} \le n$ for all lightpath
routings.
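
The relaxed value for this instance can be verified by a short counting argument. The derivation below is a sketch for $n \ge 2$; the symbol $f_a$ (the fractional flow placed on logical link $a$) and the indexing of fibers by unordered pairs $\{a,b\}$ are notation introduced here for illustration.

```latex
% Each pair {a,b} of logical links shares its own fiber, so Constraint (2.1) reads
f_a + f_b \le 1 \qquad \text{for all } 1 \le a < b \le n .
% Summing over all \binom{n}{2} pairs, each f_a appears in exactly n-1 constraints:
(n-1) \sum_{a=1}^{n} f_a \;\le\; \binom{n}{2} = \frac{n(n-1)}{2}
\quad\Longrightarrow\quad
\sum_{a=1}^{n} f_a \;\le\; \frac{n}{2} .
% The assignment f_a = 1/2 for every a is feasible and attains this bound, so
\mathsf{MaxFlow}^R_{st} = \frac{n}{2}, \qquad \mathsf{MaxFlow}_{st} = 1 .
```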

Integrality  Gap  for  MinCutst 

The ratio between $\mathsf{MinCut}_{st}$ and $\mathsf{MinCut}^R_{st}$ can be shown to be at most $O(\log n)$ as
a direct application of the result by Lovász [74], who showed that the integrality
gap between integral and fractional set cover is $O(\log n)$. We can construct a
lightpath routing where the gap between the two values is $O(\log n)$, thereby showing the
tightness of the bound.

Consider a layered network consisting of a two-node logical topology, and a set of
$k$ fibers $F = \{f_1, \ldots, f_k\}$ that are shared by multiple logical links. For every subset
$T$ of $\lfloor k/2 \rfloor + 1$ fibers in $F$, we add a logical link between the two logical nodes that uses
only the fibers in $T$. Hence, for every set of $\lceil k/2 \rceil - 1$ fibers, there is a logical link that
does not use any of the fibers. This implies the Min Cut is at least $\lceil k/2 \rceil$.

On the other hand, since each logical link uses exactly $\lfloor k/2 \rfloor + 1$ fibers, the
assignment where each $y_{ij} = \frac{1}{\lfloor k/2 \rfloor + 1}$ satisfies Constraint (2.2), and is therefore a feasible
solution to the linear relaxation of $\mathsf{MinCut}_{st}$. The objective value of this solution is
$\frac{k}{\lfloor k/2 \rfloor + 1}$, which is at most 2. Therefore, the integrality gap is at least $k/4$.

Therefore, for the two-node logical network with $n = \binom{k}{\lfloor k/2 \rfloor + 1}$ logical links, the
ratio between the integral and relaxed optimal values for the Min Cut is $O(k) = O(\log n)$.
We summarize our observation as follows:
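
The construction above is small enough to check numerically for modest $k$. The sketch below is illustrative only: the function names are invented here, fibers are represented by the integers $0, \ldots, k-1$, and the integral Min Cut is found by exhaustive search, so it is practical only for small $k$.

```python
from fractions import Fraction
from itertools import combinations


def build_instance(k):
    """One lightpath per subset of floor(k/2)+1 of the k shared fibers."""
    fibers = list(range(k))
    size = k // 2 + 1
    return fibers, [frozenset(t) for t in combinations(fibers, size)]


def integral_min_cut(fibers, links):
    """Fewest fibers whose removal hits every lightpath (brute force)."""
    for size in range(len(fibers) + 1):
        for cut in combinations(fibers, size):
            removed = set(cut)
            if all(link & removed for link in links):
                return size


def uniform_fractional_cut(fibers, links):
    """Objective of the uniform assignment y = 1/(floor(k/2)+1) on every fiber."""
    weight = Fraction(1, len(links[0]))
    # Relaxed Constraint (2.2): each lightpath sees total weight exactly 1.
    assert all(weight * len(link) >= 1 for link in links)
    return weight * len(fibers)


# Collect (k, number of lightpaths, integral cut, fractional objective, ratio).
rows = []
for k in range(2, 9):
    fibers, links = build_instance(k)
    cut = integral_min_cut(fibers, links)
    frac = uniform_fractional_cut(fibers, links)
    rows.append((k, len(links), cut, frac, Fraction(cut) / frac))
```

Each row records the integral cut alongside the objective of the uniform fractional solution, so the growth of the ratio with $k$ can be read off directly from `rows`.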


Observation 2 In a layered network, the values of $\mathsf{MaxFlow}_{st}$, $\mathsf{MaxFlow}^R_{st}$ and $\mathsf{MinCut}_{st}$
can be all different. In addition, the gaps among the three values are not bounded by
any constant.


Therefore, a multi-layer network with a high connectivity value (i.e., one that tolerates
a large number of failures) does not guarantee the existence of physically disjoint paths.
This  is  in  sharp  contrast  to  single-layer  networks  where  the  number  of  disjoint  paths 
is  always  equal  to  the  minimum  cut. 

It  is  thus  clear  that  network  survivability  metrics  across  layers  are  not  trivial 
extensions  of  the  single  layer  metrics.  New  metrics  need  to  be  carefully  defined  in 
order  to  measure  cross-layer  survivability  in  a  meaningful  manner.  In  Section  2.3,  we 
will  specify  the  requirements  for  cross-layer  survivability  metrics,  and  propose  two 
new  metrics  that  can  be  used  to  measure  the  connectivity  of  multi-layer  networks. 


2.2.2  Minimum  Survivable  Path  Set 

In  this  section,  we  introduce  another  graph  structure,  called  Survivable  Path  Set, 
that  is  useful  in  describing  connectivity  in  layered  networks.  A  survivable  path  set 
for two logical nodes $s$ and $t$ is a set of $s$-$t$ logical paths such that at least one of the
paths in the set survives for any single physical link failure. The Minimum Survivable
Path Set, denoted as $\mathsf{MinSPS}_{st}$, is the size of the smallest survivable path set. For
convenience, $\mathsf{MinSPS}_{st}$ is defined to be $\infty$ if no survivable path set exists.

In a single-layer network, the value of $\mathsf{MinSPS}_{st}$ reveals nothing more than the
existence of disjoint paths, as its value is either 2 or $\infty$, depending on whether disjoint
paths between $s$ and $t$ exist. However, for multi-layer networks, $\mathsf{MinSPS}_{st}$ can be any
integer between 2 and $\infty$.$^1$ For example, in Figure 2-1, the minimum survivable path
set for $s$ and $t$ has size three because any pair of logical links can be disconnected by
a single fiber failure. In fact, it is easy to verify that:

• $\mathsf{MinSPS}_{st} = 2$ if and only if $\mathsf{MaxFlow}_{st} \ge 2$;

• $\mathsf{MinSPS}_{st} = \infty$ if and only if $\mathsf{MinCut}_{st} = 1$.
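
For a two-node logical topology, where every $s$-$t$ path is a single lightpath, the definition can be evaluated directly. The sketch below is illustrative: `min_sps` and the fiber labels are names introduced here, and the search starts at size two because every lightpath uses at least one fiber (possibly one that is not shared and therefore not listed), so a single lightpath can never be survivable on its own.

```python
import math
from itertools import combinations


def min_sps(fibers_of):
    """Size of the smallest set of lightpaths with no fiber common to all of them.

    `fibers_of` maps each lightpath to the set of shared fibers it uses.
    Returns math.inf when no survivable path set exists.
    """
    links = sorted(fibers_of)
    for size in range(2, len(links) + 1):
        for chosen in combinations(links, size):
            common = set.intersection(*(set(fibers_of[l]) for l in chosen))
            if not common:  # any single fiber failure leaves some path alive
                return size
    return math.inf


# Figure 2-1 style instance: every pair of lightpaths shares a fiber.
pairwise_shared = {"e1": {"fa", "fb"}, "e2": {"fa", "fc"}, "e3": {"fb", "fc"}}

# Two physically disjoint lightpaths (Max Flow is at least 2).
disjoint = {"e1": {"f1"}, "e2": {"f2"}}

# All lightpaths traverse one common fiber (Min Cut is 1).
single_point_of_failure = {"e1": {"f"}, "e2": {"f", "g"}, "e3": {"f"}}

results = (min_sps(pairwise_shared), min_sps(disjoint),
           min_sps(single_point_of_failure))
```

The three instances correspond to the intermediate regime and to the two boundary cases listed above.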


Therefore, the value of $\mathsf{MinSPS}_{st}$ provides a different perspective about the
connectivity between two nodes in the cross-layer setting. It is particularly interesting
in the regime where $\mathsf{MaxFlow}_{st} = 1$ and $\mathsf{MinCut}_{st} \ge 2$, i.e., there is a gap between
the Max Flow and the Min Cut. The following theorem reveals a connection between
survivable path sets and the relaxed Max Flow $\mathsf{MaxFlow}^R_{st}$.


Theorem 2.2 $\mathsf{MinSPS}_{st} \le \left\lfloor \dfrac{\log |E_P|}{\log \mathsf{MaxFlow}^R_{st}} \right\rfloor + 1$.


Proof. See Appendix 2.7.2. □


It is worth noting that the theorem provides a sufficient condition for the existence
of disjoint paths in the layered networks, in terms of the optimal value of $\mathsf{MaxFlow}^R_{st}$:


Corollary 2.3 Disjoint paths between two nodes $s$ and $t$ exist in a layered network
if the relaxed Max Flow, $\mathsf{MaxFlow}^R_{st}$, is greater than $\sqrt{|E_P|}$.

Proof. By Theorem 2.2, a survivable path set of size two exists if $\mathsf{MaxFlow}^R_{st} > \sqrt{|E_P|}$.
This implies the existence of $s$-$t$ disjoint paths in the layered network. □

$^1$ An instance with $\mathsf{MinSPS}_{st} = k$ can be easily constructed using the 2-node, $k$-link logical topology
similar to Figure 2-1, in which every set of $k - 1$ logical links share a common physical fiber.

Therefore, survivable path sets not only are interesting graph structures that
describe  connectivity  of  layered  networks,  they  can  also  be  useful  in  revealing  the 
relationship  between  integral  and  fractional  flows  in  the  layered  network. 

2.2.3  Spanning  Trees 

For a single-layer graph $G = (V, E)$, a spanning tree can be defined as a minimal set of
edges in $E$ that keeps all nodes in $V$ connected. Since all spanning trees of the graph
have the same number of edges, constructing, counting and sampling spanning trees
in a single-layer network can be done in polynomial time [46,47,61,82,96]. These nice
properties of spanning trees in single-layer networks allow construction of efficient
algorithms for reliable single-layer network design [39,82,108].

For multi-layer networks, however, the characteristics of spanning trees are vastly
different. We define a cross-layer spanning tree as follows:

Definition  2.3  In  a  multi-layer  network,  a  Cross-Layer  Spanning  Tree  is  a  minimal 
set  of  physical  fibers  whose  survival  will  keep  the  logical  topology  connected. 

Unlike single-layer networks, the number of edges in a cross-layer spanning tree
can  vary  significantly.  Consider  Figure  2-2,  which  shows  the  lightpath  routing  of  a 
two-node  logical  topology  over  the  physical  network  with  three  links.  In  the  example, 
{1,2}  and  {3}  are  two  minimal  sets  of  physical  links  that  keep  the  logical  topology 
connected.  Therefore,  not  all  cross-layer  spanning  trees  have  the  same  cardinality.  In 
fact,  the  example  can  be  easily  modified  such  that  one  of  the  logical  links  traverses 
an  arbitrary  number  of  physical  fibers.  This  means  that  cross-layer  spanning  trees  in 
a  multi-layer  network  can  have  significantly  different  sizes. 
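
For small instances, cross-layer spanning trees can be listed by exhaustive search, which makes the difference in cardinality concrete. The sketch below is illustrative only: the function names are invented here, and the encoding of the Figure 2-2 example (one lightpath routed over fibers 1 and 2, the other over fiber 3) is an assumption consistent with the text rather than a transcription of the figure.

```python
from itertools import combinations


def connected(nodes, edges):
    """True if the undirected graph (nodes, edges) is connected."""
    nodes = list(nodes)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return len(seen) == len(nodes)


def survives(alive, nodes, lightpaths):
    """Logical topology stays connected when only the fibers in `alive` survive."""
    up = [(u, v) for u, v, route in lightpaths if route <= alive]
    return connected(nodes, up)


def cross_layer_spanning_trees(nodes, lightpaths, fibers):
    """All minimal fiber sets whose survival keeps the logical topology connected."""
    trees = []
    for size in range(len(fibers) + 1):
        for subset in combinations(sorted(fibers), size):
            alive = frozenset(subset)
            # Sets are visited by increasing size and survival is monotone,
            # so a surviving set is minimal iff it contains no earlier tree.
            if survives(alive, nodes, lightpaths) and not any(t < alive for t in trees):
                trees.append(alive)
    return trees


# Assumed encoding of the Figure 2-2 example.
nodes = {"s", "t"}
lightpaths = [("s", "t", frozenset({1, 2})), ("s", "t", frozenset({3}))]
trees = cross_layer_spanning_trees(nodes, lightpaths, {1, 2, 3})
```

Lengthening the route of the first lightpath in `lightpaths` yields a cross-layer spanning tree of correspondingly larger size, while the single-fiber tree is unchanged.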

Figure 2-2: $\{L_1, L_2\}$ and $\{L_3\}$ are cross-layer spanning trees with different cardinalities.

The minimum cross-layer spanning tree of a layered network, defined to be the
cross-layer  spanning  tree  with  the  minimum  number  of  physical  fibers,  is  of  particular 
importance  for  cross-layer  survivability.  Intuitively,  this  is  the  minimum  number  of 
physical  fibers  that  need  to  survive  in  order  to  keep  the  logical  topology  connected. 
In Chapter 4, we will investigate in greater detail the role of minimum cross-layer
spanning  trees  in  cross-layer  survivability.  The  following  theorem  gives  a  lower  bound 
on  the  size  of  the  minimum  cross-layer  spanning  tree  in  a  network: 

Theorem 2.4 The size of the minimum cross-layer spanning tree is at least $|V_L| - 1$,
where $V_L$ is the set of the logical nodes.

Proof. For a set of physical links $S$ to be a cross-layer spanning tree, all nodes in $V_L$
must be connected in the underlying physical subgraph induced by $S$. For $S$ to span
a set of $|V_L|$ nodes, it must contain at least $|V_L| - 1$ edges. □

2.2.4  Computational  Complexity 

The  structures  discussed  in  the  previous  sections  are  basic  building  blocks  for  many 
survivability algorithms for single-layer networks [4,39,43,62,82,108]. These
algorithms are effective for single-layer networks because these basic structures can be
computed efficiently. However, in multi-layer networks, such structures become
significantly more difficult to compute, making network survivability measurement and
design  much  more  difficult  in  the  multi-layer  setting.  In  this  section,  we  will  prove 
several  complexity  results  for  the  graph  structures  introduced  in  the  previous  sections. 

Max  Flow  and  Min  Cut 

For  single-layer  networks,  because  the  integral  Max  Flow  and  Min  Cut  values  are 
always  identical  to  the  optimal  relaxed  solutions,  these  values  can  be  computed  in 
polynomial time [4]. However, computing and approximating their cross-layer
equivalents turns out to be much more difficult. Theorem 2.5 describes the complexity of
computing the Max Flow and Min Cut for multi-layer networks.

Theorem 2.5 Computing Max Flow and Min Cut for multi-layer networks is NP-hard.
In addition, both values cannot be approximated within any constant factor,
unless P=NP.


Proof. We reduce the NP-hard Maximum Set Packing problem [48] to the Max Flow
problem:

Maximum Set Packing: Given a set of elements $E = \{e_1, e_2, \ldots, e_n\}$ and a
family $\mathcal{F} = \{C_1, C_2, \ldots, C_m\}$ of subsets of $E$, find the maximum value $k$ such that
there exist $k$ subsets $\{C_{j_1}, C_{j_2}, \ldots, C_{j_k}\} \subseteq \mathcal{F}$ that are mutually disjoint.

Given  an  instance  of  Maximum  Set  Packing,  we  construct  a  2-node  logical  topology 
connected  by  multiple  lightpaths  as  described  in  Theorem  2.1,  so  that  the  optimal 
value  of  the  Maximum  Set  Packing  instance  equals  the  maximum  number  of  physically 
disjoint  paths  in  the  2-node  logical  topology.  This  means  that  Maximum  Set  Packing 
is  polynomial  time  reducible  to  the  2-node  disjoint  path  problem.  Theorem  2.1  implies 
that  any  instance  of  the  2-node  disjoint  path  problem  is  polynomial  time  reducible 
to  an  instance  of  the  multi-layer  Max  Flow  problem.  It  follows  that  Maximum  Set 
Packing is polynomial time reducible to the multi-layer Max Flow problem. Therefore,
computing  the  multi-layer  Max  Flow  is  NP-Hard. 

Given an instance of Maximum Set Packing with ground set $E$ and a family $\mathcal{F}$
of subsets of $E$, we construct a logical topology with two nodes, $s$ and $t$, connected
by $|\mathcal{F}|$ logical links, where each logical link corresponds to a subset in $\mathcal{F}$. The logical
links are embedded on the physical network in a way that two logical links share a
physical fiber if and only if their corresponding subsets share a common element in
the Maximum Set Packing instance. It immediately follows that a set of physically
disjoint $s$-$t$ paths in the logical topology corresponds to a family of mutually disjoint
subsets of $E$.

Similarly, we reduce the NP-hard Minimum Set Cover problem [48] to the Min Cut
problem:

Minimum Set Cover: Given a set $E = \{e_1, e_2, \ldots, e_n\}$ and a family $\mathcal{F} =
\{C_1, C_2, \ldots, C_m\}$ of subsets of $E$, find the minimum value $k$ such that there exist $k$
subsets $\{C_{j_1}, C_{j_2}, \ldots, C_{j_k}\} \subseteq \mathcal{F}$ that cover $E$, i.e., $\bigcup_{i=1}^{k} C_{j_i} = E$.

Given an instance of Minimum Set Cover with ground set $E$ and family of subsets
$\mathcal{F}$, we construct a logical topology that contains two nodes connected by a set of $|E|$
logical links, where each logical link $l_i$ corresponds to the element $e_i$. The logical
links are embedded on the physical network in a way that exactly $|\mathcal{F}|$ fibers, namely
$\{f_1, \ldots, f_{|\mathcal{F}|}\}$, are used by multiple logical links, and the logical link $l_i$ uses physical
fiber $f_j$ if and only if $e_i \in C_j$. It follows that the minimum number of physical fibers
that forms a cut between the two logical nodes equals the size of a minimum set cover.
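
The correspondence between set covers and cuts in this reduction can be checked on a small instance. The sketch below is illustrative only: the function names and the example instance are invented here, subsets are indexed from 0, and both optima are found by exhaustive search.

```python
from itertools import combinations


def reduce_set_cover_to_min_cut(elements, family):
    """Map each element to a lightpath and each subset C_j to a fiber j.

    The lightpath for element e uses fiber j exactly when e belongs to C_j.
    """
    return {e: frozenset(j for j, c in enumerate(family) if e in c)
            for e in elements}


def min_set_cover(elements, family):
    """Size of a smallest subfamily whose union is the ground set (brute force)."""
    target = set(elements)
    for size in range(len(family) + 1):
        for chosen in combinations(range(len(family)), size):
            if set().union(*(family[j] for j in chosen)) >= target:
                return size
    return None  # the family does not cover the ground set


def min_cut(fibers_of, num_fibers):
    """Fewest fibers whose removal hits every lightpath (brute force)."""
    for size in range(num_fibers + 1):
        for cut in combinations(range(num_fibers), size):
            removed = set(cut)
            if all(route & removed for route in fibers_of.values()):
                return size
    return None  # some lightpath uses none of the fibers


# Hypothetical Minimum Set Cover instance.
elements = [1, 2, 3, 4, 5]
family = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]

fibers_of = reduce_set_cover_to_min_cut(elements, family)
cover_size = min_set_cover(elements, family)
cut_size = min_cut(fibers_of, len(family))
```

A chosen set of fibers hits the lightpath of element $e_i$ exactly when one of the corresponding subsets contains $e_i$, which is why the two brute-force searches return the same value.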

The  inapproximability  result  follows  immediately  from  the  inapproximabilities  of 
the  Maximum  Set  Packing  and  Minimum  Set  Cover  problems  [11,54,75].  □ 

Minimum  Survivable  Path  Set 

As discussed in Section 2.2.2, the size of the Minimum Survivable Path Set for single-layer
networks is either 2 or $\infty$, depending on whether the network graph is bi-connected.
Therefore, the Minimum Survivable Path Set can be easily computed in single-layer
networks. In multi-layer networks, the Minimum Survivable Path Set can take on
many  different  sizes,  and  computing  its  value  becomes  NP-Hard  and  inapproximable, 
just  like  the  cross-layer  Max  Flow  and  Min  Cut: 

Theorem 2.6 Computing Minimum Survivable Path Set for multi-layer networks is
NP-hard. In addition, it cannot be approximated within any constant factor, unless
P=NP.

Proof. The NP-Hardness for the Minimum Survivable Path Set problem can be proved
by a reduction from the Minimum Set Cover problem similar to the one used in the proof of Theorem 2.5.
Given an instance of Minimum Set Cover with ground set $E$ and family of subsets
$\mathcal{F}$, we construct a logical topology that contains two nodes connected by a set of $|\mathcal{F}|$
logical links, where each logical link $l_i$ corresponds to the set $C_i \in \mathcal{F}$. The logical
links are embedded on the physical network in a way that exactly $|E|$ fibers, namely
$\{f_1, \ldots, f_{|E|}\}$, are used by multiple logical links, and the logical link $l_i$ uses physical
fiber $f_j$ if and only if $e_j \notin C_i$. In this case, a set of logical links form a survivable
path set between $s$ and $t$ if and only if, for any fiber $f_j$, there exists a logical link $l_i$
in the path set that does not use $f_j$. This implies element $e_j$ is covered by the set
$C_i$ in the corresponding Minimum Set Cover instance. This proves the NP-Hardness
and inapproximability of Minimum Survivable Path Set. □

Minimum  Spanning  Tree 

Since  all  spanning  trees  in  a  single-layer  network  have  the  same  number  of  edges, 
computing  a  minimum  spanning  tree  is  trivial.  In  multi-layer  networks,  finding  a 
minimum (cardinality) spanning tree becomes an intractable problem, as described
in  Theorem  2.7: 

Theorem 2.7 Given the lightpath routing for a multi-layer network $\mathcal{G} = (G_P, G_L)$,
finding its Minimum Cross-Layer Spanning Tree is NP-hard.

Proof. We prove the theorem by constructing a reduction from the NP-Hard
Minimum Label Spanning Tree problem [26]:

Minimum Label Spanning Tree: Given a graph $G = (V, E)$ and a set of labels
$\mathcal{L} = \{L_1, \ldots, L_m\}$, where each edge $e \in E$ is associated with a set of labels $\mathcal{L}_e \subseteq \mathcal{L}$, find
a spanning tree $T$ of $G$ with the minimum number of labels, that is, one for which the value $|\bigcup_{e \in T} \mathcal{L}_e|$
is minimized.

Given an instance of the Minimum Label Spanning Tree problem, we will
construct an instance of the Minimum Cross-Layer Spanning Tree problem, such that
the optimal values of the two instances are equal. The details of the reduction are
described in Appendix 2.7.3. □

In summary, multi-layer connectivity exhibits fundamentally different structural
properties from its single-layer counterpart. Because of that, it is important to
reinvestigate issues of quantifying, measuring as well as optimizing survivability in
multi-layer networks. In the rest of the chapter, we will focus on designing appropriate
metrics for layered networks, and developing algorithms to maximize the cross-layer
survivability.


2.3  Metrics  for  Cross-Layer  Survivability 

The previous section demonstrates the new challenges in designing survivable
layered network architectures. Insights into quantifying and optimizing survivability are
fundamentally different between the single-layer and multi-layer settings. In this
section, we focus on the issue of quantifying survivability in multi-layer networks. Not
only  should  such  metrics  have  natural  physical  meaning  in  the  cross-layer  setting, 
they  should  also  be  mathematically  consistent  and  compatible  with  the  conventional 
single-layer  connectivity  metric.  Hence,  we  first  define  formal  requirements  for  metrics 
that  can  be  used  to  quantify  cross-layer  survivability: 

•  Consistency:  A  network  with  a  higher  metric  value  should  be  more  resilient 
to  failures. 

•  Monotonicity:  Any  addition  of  physical  or  logical  links  to  the  network  should 
not  decrease  the  metric  value. 

•  Compatibility:  The metric should generalize the connectivity metric for
single-layer networks. In particular, when applied to the degenerate case where the
physical and logical topologies are identical, the metric should be equivalent to
the connectivity of the topology.

A metric that carries all the above properties would give us a meaningful and
consistent measure of survivability in the multi-layer setting. We propose two metrics, the
Min Cross Layer Cut and the Weighted Load Factor, that can be used to quantify
survivability for multi-layer networks. It is easy to verify that both metrics satisfy
the above requirements.

2.3.1 Min Cross Layer Cut

In Section 2.2, we defined $\mathsf{MinCut}_{st}$ to be the minimum number of physical failures
that  would  disconnect  logical  nodes  s  and  t.  One  can  easily  generalize  this  by  taking 
the  minimum  over  all  possible  node  pairs  to  obtain  a  global  connectivity  metric.  We 
define  the  Min  Cross  Layer  Cut  (MCLC)  to  be  the  minimum  number  of  physical 
failures  that  would  disconnect  the  logical  topology. 

A lightpath routing with a high Min Cross Layer Cut value implies that the
network remains connected even after a large number of physical failures. It is also a
generalization of the survivable lightpath routing definition in [76], since a lightpath
routing is survivable if and only if its Min Cross Layer Cut is greater than 1.

Let $S$ be a subset of the logical nodes $V_L$, and $\delta(S)$ be the set of the logical links
with exactly one end point in $S$. Let $H_S$ be the minimum number of physical link
failures required to disconnect all links in $\delta(S)$. The Min Cross Layer Cut can be
defined as follows:

$$\mathsf{MCLC} = \min_{S \subset V_L} H_S.$$

For each $S$, computing $H_S$ can be considered as finding the Min Cut between
the two partitions $S$ and $V_L - S$. In the proof of Theorem 2.5, we have shown that
computing the value of $\mathsf{MinCut}_{st}$ is NP-Hard even if the logical topology contains
just two nodes. This immediately implies that computing the global MCLC value is
NP-Hard:


Theorem  2.8  Computing  the  MCLC  for  a  layered  network  is  NP-Hard. 
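
Although the general problem is NP-Hard, on small instances the MCLC can be evaluated directly from its definition by enumerating fiber sets in order of increasing size. The sketch below is illustrative only: the function names, the triangle logical topology and the two routings are hypothetical examples introduced here, and the search is exponential in the number of fibers in the worst case.

```python
from itertools import combinations


def logical_connected(nodes, lightpaths, failed):
    """Connectivity of the logical topology after the fibers in `failed` are removed."""
    alive = [(u, v) for u, v, route in lightpaths if not (route & failed)]
    nodes = list(nodes)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for a, b in alive:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
            elif b == u and a not in seen:
                seen.add(a)
                stack.append(a)
    return len(seen) == len(nodes)


def mclc(nodes, lightpaths, fibers):
    """Minimum number of fiber failures that disconnects the logical topology."""
    for size in range(len(fibers) + 1):
        for failed in combinations(sorted(fibers), size):
            if not logical_connected(nodes, lightpaths, frozenset(failed)):
                return size
    return None  # the logical topology cannot be disconnected (e.g. a single node)


# Hypothetical example: a logical triangle over a three-fiber physical ring.
nodes = {"A", "B", "C"}
fibers = {"AB", "BC", "CA"}

# Routing 1: every lightpath takes the direct fiber between its end points.
direct = [("A", "B", frozenset({"AB"})),
          ("B", "C", frozenset({"BC"})),
          ("A", "C", frozenset({"CA"}))]

# Routing 2: lightpath A-C is routed the long way round, over fibers AB and BC.
detour = [("A", "B", frozenset({"AB"})),
          ("B", "C", frozenset({"BC"})),
          ("A", "C", frozenset({"AB", "BC"}))]

mclc_direct = mclc(nodes, direct, fibers)
mclc_detour = mclc(nodes, detour, fibers)
```

The two routings embed the same logical topology on the same physical topology, yet in the second one the failure of fiber `AB` alone removes both lightpaths incident to node `A`, which illustrates how the metric depends on the lightpath routing.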

In practice, however, the MCLC is bounded above by the minimum node degree of the
logical topology, which is usually a small constant $d$. In that case, the MCLC can be
computed in polynomial time by enumerating all physical fiber sets with up to $d$ fibers.
In a general setting, the MCLC of a layered network can be modelled by the following
integer linear program.

Given the physical and logical topologies $(V_P, E_P)$ and $(V_L, E_L)$, let $f_{ij}^{st}$ be binary
constants that represent the lightpath routing, such that logical link $(s,t)$ uses
physical fiber $(i,j)$ if and only if $f_{ij}^{st} = 1$. The MCLC can be formulated as the integer
program below:

\[
\begin{aligned}
M_{MCLC}: \quad \text{Minimize } & \sum_{(i,j) \in E_P} y_{ij}, \quad \text{subject to:} \\
& d_t - d_s \le \sum_{(i,j) \in E_P} y_{ij} f_{ij}^{st} && \forall (s,t) \in E_L && (2.3) \\
& \sum_{n \in V_L} d_n \ge 1, \quad d_0 = 0 && && (2.4) \\
& d_n,\ y_{ij} \in \{0, 1\} && \forall n \in V_L,\ (i,j) \in E_P
\end{aligned}
\]

The integer program contains a variable $y_{ij}$ for each physical link $(i,j)$ and a
variable $d_k$ for each logical node $k$. Constraint (2.3), which bounds $d_t - d_s$ for every
logical link $(s,t)$, maintains the following property for any feasible solution: if
$d_k = 1$, the node $k$ will be disconnected from node 0 after all physical links $(i,j)$
with $y_{ij} = 1$ are removed. To see this, note that since $d_k = 1$ and $d_0 = 0$, any
logical path from node 0 to node $k$ contains a logical link $(s,t)$ where $d_s = 0$ and
$d_t = 1$. Constraint (2.3) requires that such a logical link traverse at least one of the
fibers $(i,j)$ with $y_{ij} = 1$. As a result, all paths from node 0 to node $k$ must traverse
one of these fibers, and node $k$ will be disconnected from node 0 if these fibers are
removed from the network. Constraint (2.4), which forces $d_n = 1$ for at least one node
while fixing $d_0 = 0$, requires node 0 to be disconnected from at least one node. This
ensures that the set of fibers $(i,j)$ with $y_{ij} = 1$ forms a global Cross Layer Cut.

In  Section  2.4,  we  will  use  MCLC  as  the  objective  for  the  survivable  lightpath 
routing  problem,  and  develop  algorithms  to  maximize  this  objective. 
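
The enumeration approach mentioned above (trying all fiber sets of bounded size) can be sketched in a few lines. This is only an illustrative sketch, not the implementation used in this thesis: the function names `is_connected` and `mclc`, and the representation of a lightpath routing as a dictionary from logical link to the set of fibers it uses, are assumptions made for the example. A dictionary keyed by `(s, t)` cannot represent parallel logical links, and the logical topology is assumed to be connected before any failure.

```python
from itertools import combinations


def is_connected(nodes, links):
    """Return True if the undirected graph (nodes, links) is connected."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = {v: set() for v in nodes}
    for s, t in links:
        adj[s].add(t)
        adj[t].add(s)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == len(nodes)


def mclc(logical_nodes, routing, fibers, max_size=None):
    """Smallest number of fibers whose removal disconnects the logical topology.

    routing maps each logical link (s, t) to the set of fibers on its lightpath
    (a hypothetical representation chosen for this sketch).
    Returns None if no fiber set of size <= max_size is a cross-layer cut.
    """
    fibers = list(fibers)
    if max_size is None:
        max_size = len(fibers)
    for size in range(1, max_size + 1):
        for failed in combinations(fibers, size):
            failed = set(failed)
            # A logical link survives only if none of its fibers failed.
            surviving = [l for l, path in routing.items() if not (path & failed)]
            if not is_connected(logical_nodes, surviving):
                return size
    return None
```

For a triangle logical topology whose links are routed on three distinct fibers, `mclc` returns 2; if two links incident to the same node share a fiber, it returns 1. Passing `max_size=d` bounds the search to fiber sets of at most $d$ fibers, which is what makes the enumeration polynomial for constant $d$.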

2.3.2  Weighted  Load  Factor 

Another  way  to  measure  the  connectivity  of  a  layered  network  is  by  quantifying  the 
“impact”  of  each  physical  failure.  The  Weighted  Load  Factor  ( WLF ),  an  extension  of 
the  metric  Load  Factor  introduced  in  [60],  provides  such  a  measure  of  survivability. 

Given the physical topology $(V_P, E_P)$ and logical topology $(V_L, E_L)$, let $f_{ij}^{st}$ be
binary constants that represent the lightpath routing, such that logical link $(s,t)$ uses
physical fiber $(i,j)$ if and only if $f_{ij}^{st} = 1$. The WLF can be formulated as follows:

\[
\begin{aligned}
M_{WLF}: \quad \text{Maximize } & \frac{1}{z}, \quad \text{subject to:} \\
& z \sum_{(s,t) \in \delta(S)} w_{st} \ge \sum_{(s,t) \in \delta(S)} w_{st} f_{ij}^{st} && \forall S \subset V_L,\ (i,j) \in E_P \\
& \sum_{(s,t) \in \delta(S)} w_{st} > 0 && \forall S \subset V_L \\
& 0 \le w_{st} \le 1 && \forall (s,t) \in E_L,
\end{aligned}
\]

where $\delta(S)$ is the cut set of $S$, i.e., the set of logical links that have exactly one end
point in $S$.

The variables $w_{st}$ are the weights assigned to the lightpaths. Over all possible
logical cuts, the variable $z$ measures the maximum fraction of weight inside a cut
carried by a fiber. Intuitively, if we interpret the weight to be the amount of traffic in
the lightpath, the value $z$ can be interpreted as the maximum fraction of traffic across
a set of nodes disrupted by a single fiber cut. The Weighted Load Factor formulation,
defined to maximize the reciprocal of this fraction, thus tries to compute the logical
edge weights that minimize the maximum fraction. This effectively measures the
best way of spreading the weight across the fibers for the given lightpath routing. A
lightpath routing with a larger Weighted Load Factor value is more capable of
spreading its weight within any cut across the fibers.
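
To make the role of $z$ concrete, the following sketch evaluates, for one fixed weight assignment, the maximum fraction of cut weight carried by a single fiber by enumerating all logical cuts. It is an illustration under stated assumptions rather than the method used in this thesis: the function name `max_cut_fraction` and the dictionary-based routing representation are hypothetical, and the enumeration is exponential in the number of logical nodes. Because any fixed weights together with this value of $z$ form a feasible solution, the reciprocal of the returned value is a lower bound on the Weighted Load Factor, not the Weighted Load Factor itself.

```python
from fractions import Fraction
from itertools import combinations


def max_cut_fraction(logical_nodes, routing, weights):
    """Largest fraction of any logical cut's weight carried by a single fiber.

    routing: dict mapping logical link (s, t) -> set of fibers it uses.
    weights: dict mapping logical link (s, t) -> weight (int or Fraction).
    Both representations are assumptions made for this sketch.
    """
    nodes = list(logical_nodes)
    fibers = set().union(*routing.values())
    worst = Fraction(0)
    # Enumerate every proper subset S that contains nodes[0]; a set and its
    # complement define the same cut, so each cut is visited exactly once.
    rest = nodes[1:]
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            S = {nodes[0], *extra}
            cut = [l for l in routing if (l[0] in S) != (l[1] in S)]
            total = sum(Fraction(weights[l]) for l in cut)
            if total == 0:
                continue  # the formulation requires positive cut weight
            for fiber in fibers:
                carried = sum(Fraction(weights[l]) for l in cut if fiber in routing[l])
                worst = max(worst, carried / total)
    return worst
```

With unit weights on a triangle whose three links use three distinct fibers, every cut contains two links and each fiber carries one of them, so the function returns 1/2 and the resulting lower bound on the Weighted Load Factor is 2.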

The  Weighted  Load  Factor  also  generalizes  the  survivable  lightpath  routing  defined 
in  [76],  since  its  value  will  be  greater  than  1  if  and  only  if  the  lightpath  routing  is 
survivable. 

Although the formulation $M_{WLF}$ contains the quadratic terms $z w_{st}$, the optimal
value of $z$ can be obtained by iteratively solving the linear program with different
fixed values of $z$. Using binary search over the range of $z$, we can find the minimum
$z$ where a feasible solution exists.

Computing the Weighted Load Factor is easier than computing MCLC in certain
cases. For example, when the logical topology contains only two nodes with multiple
logical links between them, finding the Weighted Load Factor can be formulated as a
linear optimization problem:

\[
\begin{aligned}
\text{Maximize } & \sum_{(s,t) \in E_L} w_{st}, \quad \text{subject to:} \\
& \sum_{(s,t) \in E_L} w_{st} f_{ij}^{st} \le 1 && \forall (i,j) \in E_P \\
& 0 \le w_{st} \le 1 && \forall (s,t) \in E_L
\end{aligned}
\]

by replacing $\frac{1}{z}$ in the formulation $M_{WLF}$ by $\sum_{(s,t) \in E_L} w_{st}$. It can be easily verified
that the two formulations are equivalent when the logical topology contains only two nodes.

Therefore, for certain special cases such as the two node logical network, computing
the Weighted Load Factor appears to be easier than computing the Min Cross Layer Cut.
However, in general, the formulation $M_{WLF}$ contains an exponential number of
constraints, and may not be polynomial time solvable. In fact, Theorem 2.9 states that
finding the objective value for $M_{WLF}$ is NP-Hard, even if the weights of the logical
links $w_{st}$ are given.

Theorem 2.9 Computing the Weighted Load Factor for a lightpath routing is NP-Hard
even if the weight assignment $w_{st}$ for the logical links is fixed.

Proof. The NP-Hardness proof is based on a reduction from the NP-Hard Uniform
Sparsest Cut [7] problem. For details, see Appendix 2.7.4. □

Finally, Theorem 2.10 describes the relationship between the WLF and the
MCLC. Given a lightpath routing, let $M_{MCLC}$ be the ILP formulation for its Min Cross
Layer Cut, and let $MCLC$ and $MCLC_R$ be the optimal values for $M_{MCLC}$ and its
linear relaxation respectively. In addition, let $WLF$ be the Weighted Load Factor of
the lightpath routing. Then we have the following relationship:

Theorem 2.10 $MCLC_R \le WLF \le MCLC$.

Proof.  See  Appendix  2.7.5.  □ 

Therefore, although the two metrics appear to measure different aspects of network
connectivity, they are inherently related. In fact, as we will see in Section 2.5, the
two values are often identical. The connection between the two metrics thus provides
insights into the development of the lightpath routing formulation $MCF_{LF}$, to be
introduced in Section 2.4.2.

As a concluding remark for this section, the two metrics introduced here are both
NP-hard to compute. It remains an interesting open question whether there exists
any meaningful cross-layer survivability metric that is polynomial time computable.


2.4  Lightpath  Routing  Algorithms  for  Maximizing 
MCLC 

In  this  section,  we  consider  the  survivable  lightpath  routing  problem  using  the  Min 
Cross  Layer  Cut  as  the  objective.  At  an  abstract  level,  the  optimal  lightpath  routing 
can  be  expressed  as  the  following  optimization  problem: 

\[ \max_{f \in \mathcal{F}} \ \min_{S \subset V_L} \ MFC(f, S), \]

where $\mathcal{F}$ is the set of all possible lightpath routings, $V_L$ is the logical node set, and
$MFC(f, S)$ is the minimum number of fibers whose removal will disconnect all logical
links in the cut set $\delta(S)$ given the lightpath routing $f$. This is a Max-Min-Min
problem that may not have a simple formulation. In Section 2.4.1, we first present an
ILP formulation that maximizes the MCLC for the lightpath routing. However, the
formulation has a large number of variables and is difficult to solve in practice.
Therefore, in Section 2.4.2 we will present several simpler formulations that approximate
MCLC maximization.

2.4.1 ILP for MCLC Maximization


We  first  present  a  survivable  lightpath  routing  ILP  that  maximizes  the  MCLC  value: 


1. Parameters:

• $G_P = (V_P, E_P)$: Physical topology.

• $G_L = (V_L, E_L)$: Logical topology.

• $d$: The minimum cut of the logical topology.

• $\mathcal{C}$: The family of all possible subsets of physical fibers with size at most $d$.

• $W_i$: A weight associated with each fiber set of size $i$:

\[
W_i = \begin{cases}
1 & \text{if } i = |E_P|, \\
[\,\ldots\,] & \text{if } 1 \le i \le |E_P| - 1.
\end{cases}
\]

2. Variables:

• $f_{ij}^{st} \in \{0, 1\}$ for $(s,t) \in E_L$, $(i,j) \in E_P$: Represents the lightpath routing,
where $f_{ij}^{st} = 1$ if and only if logical link $(s,t)$ uses fiber $(i,j)$.

• $y_C \in [0, 1]$ for $C \in \mathcal{C}$: Represents whether the fiber set $C$ is a cross-layer
cut. The fiber set $C$ is a cross-layer cut if and only if its value is 1.

• $x_{st}^{vC} \in [0, 1]$ for $(s,t) \in E_L$, $v \in V_L - \{0\}$, $C \in \mathcal{C}$: Flow variable on the
surviving logical topology when fibers in $C$ fail. This is used to express the
connectedness of the surviving logical topology under this set of physical
failures.

3. Formulation:

\[
\begin{aligned}
\text{MCLC\_MAX}: \quad \text{Minimize } & \sum_{i=1}^{d} W_i \sum_{C \in \mathcal{C}: |C| = i} y_C, \quad \text{subject to:} \\
& x_{st}^{vC} \le 1 - f_{ij}^{st} \qquad \forall (i,j) \in C,\ (s,t) \in E_L,\ v \in V_L - \{0\},\ C \in \mathcal{C} && (2.5) \\
& \sum_{t: (s,t) \in E_L} x_{st}^{vC} - \sum_{t: (t,s) \in E_L} x_{ts}^{vC} =
\begin{cases}
1 - y_C, & \text{if } s = 0 \\
y_C - 1, & \text{if } s = v \\
0, & \text{otherwise}
\end{cases}
\qquad \forall v \in V_L - \{0\},\ C \in \mathcal{C} && (2.6) \\
& \{(i,j) : f_{ij}^{st} = 1\} \text{ forms an } (s,t)\text{-path}, \qquad \forall (s,t) \in E_L \\
& f_{ij}^{st} \in \{0, 1\}, \quad x_{st}^{vC} \ge 0, \quad 0 \le y_C \le 1.
\end{aligned}
\]


The objective of the formulation is to minimize the total weighted sum of the
cross-layer cuts. Since $W_i$ is defined in a way that the weight of a cross-layer cut with
size $i$ dominates the total weights of all cross-layer cuts with size greater than $i$, the
formulation will avoid creating a lightpath routing with small cross-layer cuts. As a
result, the optimal solution will have a maximum MCLC value. In addition, since the
connectivity of the logical topology is $d$, the MCLC value of any lightpath routing is
at most $d$. Therefore, it is sufficient to have the objective consider physical fiber sets
with size up to $d$.

By Constraints (2.5) and (2.6), the variable $x_{st}^{vC}$ represents the amount of flow
sent from logical node 0 to node $v$ along the logical link $(s,t)$, under the scenario
where fibers in $C$ fail, causing all logical links that use these fibers to fail.
Specifically, Constraint (2.5) makes sure that a positive flow can be assigned to logical link
$(s,t)$ only if the logical link $(s,t)$ does not use any of the physical fibers $(i,j) \in C$.
In other words, only the surviving logical links under the failure event $C$ can be
used. Constraint (2.6) is the flow conservation constraint on the logical topology with
flow value $1 - y_C$. If the logical topology remains connected under the failure event
$C$, a positive flow can be sent from node 0 to any other node $v$, and $y_C$ can therefore
be set to 0. On the other hand, if the logical topology is disconnected, node 0 will be
disconnected from some logical node $v$, in which case $y_C$ has to be set to 1 since no flow
can be sent between the two nodes. Since the objective is to minimize the weighted
sum of $y_C$, the variable $y_C$ will be set to 0 unless the logical topology is disconnected.
Therefore, the variable $y_C$ represents whether $C$ is a cross-layer cut. This is true even
if the binary constraint on $y_C$ is relaxed.

2.4.2  Approximate  Formulations 

Although MCLC_MAX gives us an exact formulation to maximize MCLC, the formulation
may have a large number of variables and constraints, and is therefore infeasible
to solve in practice, even if all the integer variables are relaxed. Therefore, for the rest
of the section, we consider approximate formulations whose objective values are lower
bounds to the MCLC. These formulations are much simpler than MCLC_MAX. This
makes it possible to develop survivable lightpath routing algorithms based on these
simpler formulations. In particular, in Section 2.4.3 we discuss how to use randomized
rounding [90] based on these formulations as a heuristic to approximate MCLC
maximization. Note that since MCLC is $O(\log n)$ inapproximable, polynomial time
algorithms with approximation guarantees within this factor are unlikely to exist.
Therefore, we will instead evaluate the performance of our algorithms via simulation
in Section 2.5.

All of the formulations introduced in this section are based on multi-commodity
flows, where each lightpath is considered a commodity to be routed over the physical
network. Given the physical network $G_P = (V_P, E_P)$ and the logical network
$G_L = (V_L, E_L)$, the multi-commodity flow for a lightpath routing can be generally
formulated as follows:

\[
\begin{aligned}
MCF_X: \quad \text{Minimize } & X(f), \quad \text{subject to:} \\
& f_{ij}^{st} \in \{0, 1\} \\
& \{(i,j) : f_{ij}^{st} = 1\} \text{ forms an } (s,t)\text{-path}, \quad \forall (s,t) \in E_L, && (2.7)
\end{aligned}
\]

where $f$ is the variable set that represents the lightpath routing, such that $f_{ij}^{st} = 1$
if and only if lightpath $(s,t)$ uses physical fiber $(i,j)$ in its route; and the objective
$X(f)$ is a function of the lightpath routing $f$ that captures the survivability of the
layered network.

For WDM networks where the wavelength continuity constraint is present [29,110],
the above formulation can be extended to capture the wavelength assignment aspect.
In that case, the wavelength assignment can be modelled by replacing the variable set
$f_{ij}^{st}$ by $f_{ij\lambda}^{st}$, which equals 1 if and only if lightpath $(s,t)$ uses wavelength $\lambda$ on physical
link $(i,j)$. Constraint (2.7) can be easily extended to restrict that, for each logical
link $(s,t)$, $\{(i,j) : f_{ij\lambda}^{st} = 1\}$ forms an $(s,t)$ physical path along one of the wavelengths. To
make sure that any wavelength $\lambda$ on a physical fiber is used by at most one lightpath,
the following constraint will be added:

\[ \sum_{(s,t) \in E_L} f_{ij\lambda}^{st} \le 1 \qquad \forall (i,j) \in E_P,\ \forall \lambda. \qquad (2.8) \]

Similar  formulations  based  on  multi-commodity  flows  with  wavelength  continuity 
constraint  have  been  proposed  to  solve  the  Routing  and  Wavelength  Assignment 
(RWA)  problem  of  WDM  networks  [14,85],  where  the  objective  is  to  minimize  the 
number  of  lightpaths  that  traverse  the  same  fiber.  The  key  difference  in  the  problem 
studied  in  this  chapter  is  in  the  objective  function  X,  which  should  instead  describe 
the  survivability  of  the  lightpath  routing.  To  focus  on  the  survivability  aspect  of  the 
problem,  the  wavelength  continuity  constraint  will  be  omitted  in  the  formulations 
below.  However,  in  cases  where  the  wavelength  continuity  constraint  is  necessary,  all 
these  formulations  can  be  extended  as  discussed  above. 

Simple  Multi-Commodity  Flow  Formulations 

Ideally,  to  ensure  that  the  lightpath  routing  is  survivable  against  the  largest  number  of 
failures,  the  objective  function  X(f)  should  express  the  MCLC  value  of  the  lightpath 
routing  given  by  /.  However,  since  simple  formulations  to  maximize  the  MCLC 
directly are difficult to find, we use an objective that approximates the MCLC value.

In our formulation, each lightpath is assigned a weight $w$. The objective function
$\rho_w$ measures the maximum load of the fibers, where the load is defined to be the
total lightpath weight carried by the fiber. The intuition is that the multi-commodity
flow formulation will try to spread the weight of the lightpaths across multiple fibers,
thereby minimizing the impact of any single fiber failure.

We can formulate an integer linear program with such an objective as follows:

\[
\begin{aligned}
MCF_w: \quad \text{Minimize } & \rho_w, \quad \text{subject to:} \\
& \rho_w \ge \sum_{(s,t) \in E_L} w(s,t) f_{ij}^{st} && \forall (i,j) \in E_P \\
& f_{ij}^{st} \in \{0, 1\} \\
& \{(i,j) : f_{ij}^{st} = 1\} \text{ forms an } (s,t)\text{-path}, && \forall (s,t) \in E_L
\end{aligned}
\]

As we will prove in Theorem 2.11, with a careful choice of the weight function $w$, the
value $\frac{1}{\rho_w}$ gives a lower bound on the MCLC. Therefore, a lightpath routing with a
low $\rho_w$ value is guaranteed to have a high MCLC.

The  routing  strategy  of  the  algorithm  is  determined  by  the  weight  function  w. 
For  example,  if  w  is  set  to  1  for  all  lightpaths,  the  integer  program  will  minimize  the 
number  of  lightpaths  traversing  the  same  fiber.  Effectively,  this  will  minimize  the 
number  of  disconnected  lightpaths  in  the  case  of  a  single  fiber  failure. 

In order to customize $MCF_w$ towards maximizing the MCLC of the solution, we
propose a different weight function $w_{MinCut}$ that captures the connectivity structure
of the logical topology. For each edge $(s,t) \in E_L$, we define $w_{MinCut}(s,t)$ to be
$\frac{1}{|MinCut_L(s,t)|}$, where $MinCut_L(s,t)$ is the minimum $(s,t)$-cut in the logical topology.
Therefore, if an edge $(s,t)$ belongs to a smaller cut, it will be assigned a higher weight.
The algorithm will therefore try to avoid putting these small cut edges on the same
fiber.
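
As an illustration of how the weight function and the fiber load interact, the sketch below computes the reciprocal min-cut weights for a logical topology and the maximum fiber load of a given routing. It is a minimal sketch under stated assumptions, not the code used for the experiments: the names `min_cut_size`, `mincut_weights` and `max_fiber_load` are hypothetical, the minimum $(s,t)$-cut is found with a standard unit-capacity augmenting-path max-flow, and the routing is represented as a dictionary from logical link to the set of fibers it uses.

```python
from collections import deque
from fractions import Fraction


def min_cut_size(nodes, links, s, t):
    """Size of a minimum (s, t)-cut in an undirected graph with unit capacities."""
    cap = {}
    adj = {v: set() for v in nodes}
    for u, v in links:
        # Each undirected link becomes two arcs of capacity 1.
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
        adj[u].add(v)
        adj[v].add(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # max flow equals min cut
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


def mincut_weights(nodes, links):
    """Weight 1 / |MinCut_L(s, t)| for every logical link (s, t)."""
    return {(s, t): Fraction(1, min_cut_size(nodes, links, s, t)) for s, t in links}


def max_fiber_load(routing, weights):
    """Largest total lightpath weight carried by any single fiber."""
    load = {}
    for link, path in routing.items():
        for fiber in path:
            load[fiber] = load.get(fiber, 0) + weights[link]
    return max(load.values())
```

For a four-node ring 0-1-2-3-0 with a chord (0,2), the chord has minimum cut 3 and weight 1/3, while each ring link has minimum cut 2 and weight 1/2. If links (0,1) and (0,2) share one fiber and all other links use their own fibers, the maximum fiber load is 1/2 + 1/3 = 5/6.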

If $w_{MinCut}$ is used as the weight function in $MCF_w$, we can prove the following
relationship between the objective value $\rho_w$ of a feasible solution to $MCF_w$ and the
Weighted Load Factor of the associated lightpath routing:

Theorem 2.11 For any feasible solution $f$ of $MCF_w$ with $w_{MinCut}$ as the weight
function, $\frac{1}{\rho_w} \le WLF$.

Proof. By definition of the weight function $w_{MinCut}$, given any $S \subset V_L$, every edge in
$\delta(S)$ has weight at least $\frac{1}{|\delta(S)|}$. Therefore, we have:

\[ \sum_{(s,t) \in \delta(S)} w(s,t) \ \ge \sum_{(s,t) \in \delta(S)} \frac{1}{|\delta(S)|} = 1. \qquad (2.9) \]

Now consider the lightpath routing associated with $f$. For any logical cut $\delta(S)$,
the maximum fraction of weight inside the cut carried by a fiber is:

\[
\begin{aligned}
\max_{(i,j) \in E_P} \frac{\sum_{(s,t) \in \delta(S)} w(s,t) f_{ij}^{st}}{\sum_{(s,t) \in \delta(S)} w(s,t)}
&\le \max_{(i,j) \in E_P} \sum_{(s,t) \in \delta(S)} w(s,t) f_{ij}^{st} && \text{by Equation (2.9)} \\
&\le \max_{(i,j) \in E_P} \sum_{(s,t) \in E_L} w(s,t) f_{ij}^{st} \\
&\le \rho_w.
\end{aligned}
\]

In other words, no fiber in the network is carrying more than a fraction $\rho_w$ of the
weight in any cut. This gives us a feasible solution to the Weighted Load Factor
formulation $M_{WLF}$, where each variable $w_{st}$ is assigned the value of $w_{MinCut}(s,t)$, and
the variable $z$ is assigned the value of $\rho_w$. As a result, the Weighted Load Factor,
defined to be the maximum value of $\frac{1}{z}$ among all feasible solutions to $M_{WLF}$, must be
at least $\frac{1}{\rho_w}$. □

As a result of Theorems 2.10 and 2.11, the MCLC of a lightpath routing is lower
bounded by the value of $\frac{1}{\rho_w}$, which the algorithm will try to maximize.

Enhanced  Multi-Commodity  Flow  Formulation 

As we have discussed in Section 2.3.2, the Weighted Load Factor provides a good
lower bound on the MCLC of a lightpath routing. Here we propose another
multi-commodity flow based formulation whose objective function approximates the Weighted
Load Factor of a lightpath routing. The formulation, denoted as $MCF_{LF}$, can be
written as follows:

\[
\begin{aligned}
MCF_{LF}: \quad \text{Minimize } & \gamma, \quad \text{subject to:} \\
& \gamma\, |\delta(S)| \ge \sum_{(s,t) \in \delta(S)} f_{ij}^{st} && \forall (i,j) \in E_P,\ S \subset V_L \\
& f_{ij}^{st} \in \{0, 1\} \\
& \{(i,j) : f_{ij}^{st} = 1\} \text{ forms an } (s,t)\text{-path}, && \forall (s,t) \in E_L
\end{aligned}
\]

Essentially, the formulation optimizes the unweighted Load Factor of the lightpath
routing (i.e., all weights equal one), by minimizing the maximum fraction of a logical
cut carried by a single fiber. As this formulation provides a constraint for each
logical cut, it captures the impact of a single fiber cut on the logical topology in
much greater detail. The following theorem shows that for any lightpath routing, its
associated Load Factor value $\frac{1}{\gamma}$ gives a tighter lower bound than the value $\frac{1}{\rho_w}$ given by the
$MCF_w$ formulation.

Theorem 2.12 For any lightpath routing, let $\rho_w$ be its associated objective value in
the formulation $MCF_w$, with $w_{MinCut}$ as the weight function, and let $\gamma$ be its associated
objective value in the formulation $MCF_{LF}$. In addition, let $WLF$ be its Weighted Load
Factor. Then:

\[ \frac{1}{\rho_w} \le \frac{1}{\gamma} \le WLF. \]

Proof. The value $\frac{1}{\gamma}$ is the objective value for the formulation $M_{WLF}$ in Section 2.3.2
when all logical links have weight 1. This gives a feasible solution to $M_{WLF}$, and
implies that $WLF \ge \frac{1}{\gamma}$.

To prove that $\frac{1}{\rho_w} \le \frac{1}{\gamma}$, we consider the physical link $(i,j)$ and logical cut set $\delta(S)$
where $(i,j)$ carries a fraction $\gamma$ of the logical links in $\delta(S)$. Let $L_{ij}$ be the set of logical
links in $\delta(S)$ carried by $(i,j)$. Therefore, we have $\gamma = \frac{|L_{ij}|}{|\delta(S)|}$. In addition, by the
definition of $\rho_w$, and since every link in $\delta(S)$ has weight at least $\frac{1}{|\delta(S)|}$, we have

\[ \rho_w \ \ge \sum_{(s,t) \in L_{ij}} w_{MinCut}(s,t) \ \ge \ \frac{|L_{ij}|}{|\delta(S)|} = \gamma. \]

This implies $\frac{1}{\rho_w} \le \frac{1}{\gamma}$.

Therefore, the formulation $MCF_{LF}$ gives a lightpath routing that is optimized for a
better lower bound on the MCLC. However, this comes at the cost of a larger number
of constraints, and solving such an integer program may not be feasible in practice.
Therefore, we next introduce a randomized rounding technique that approximates
the optimal lightpath routing by solving the linear relaxation of the integer program.
As we will see in Section 2.5, the randomized rounding technique significantly speeds
up the running time of the algorithm without observable degradation in the MCLC
performance. This offers a practical alternative to solving the integer program
formulations introduced in this section.

2.4.3  Randomized  Rounding  for  Lightpath  Routing 

While the multi-commodity flow integer program formulations discussed in the
previous section introduce a novel way to route lightpaths in a survivable manner, such
an approach may not scale to large networks, due to the inherent complexity of
solving integer programs. In order to circumvent the computational difficulty, we apply
the randomized rounding technique, which is able to quickly obtain a near-optimal
solution to the integer program. Randomized rounding has previously been used
to solve multi-commodity flow problems to minimize the link load [14, 90], and its
performance guarantee is studied in [90].

Given any multi-commodity flow based integer formulation, the following algorithm
$RANDOM_k$ describes the randomized rounding algorithm that computes a lightpath
routing based on the formulation.

Algorithm 1 $RANDOM_k$

1: Compute the optimal fractional solution $f$ to the linear relaxation of the
multi-commodity flow integer program. For each lightpath $(s,t)$, the values of $f_{ij}^{st}$
represent a flow from $s$ to $t$ with a total flow value of 1.

2: For each lightpath $(s,t)$, decompose the solution $f_{ij}^{st}$ into flow paths, each with
weight equal to the flow value of the path.

3: for $i = 1, 2, \ldots, k$ do:
Create a random lightpath routing $R_i$. For each lightpath $(s,t)$, randomly pick
one path from the set of flow paths generated in Step 2, using the path weights
as the probabilities.

4: Return the $R_i$ with the highest Min Cross Layer Cut value.
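
Step 2 of the algorithm relies on flow path decomposition. The following is a minimal sketch of one standard way to do it, assuming the fractional flow of a single lightpath is given as a dictionary from directed arc to flow amount; the function name `decompose_flow` and the tolerance parameter `eps` are assumptions made for this example and do not come from the thesis. Flow that circulates on cycles, if any, is simply left over and ignored.

```python
from collections import deque


def decompose_flow(flow, s, t, eps=1e-9):
    """Split an s-t flow into a list of (path, weight) pairs.

    flow maps a directed arc (i, j) to the amount of flow on it. Flow
    conservation is assumed at every node other than s and t.
    """
    residual = {arc: x for arc, x in flow.items() if x > eps}
    paths = []
    while True:
        # Breadth-first search along arcs that still carry flow.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for (i, j) in residual:
                if i == u and j not in parent:
                    parent[j] = u
                    queue.append(j)
        if t not in parent:
            return paths
        # Recover the path from t back to s.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        path.reverse()
        # The path weight is its bottleneck flow; removing it empties an arc.
        weight = min(residual[arc] for arc in path)
        for arc in path:
            residual[arc] -= weight
            if residual[arc] <= eps:
                del residual[arc]
        paths.append((path, weight))
```

Each iteration removes at least one arc from the residual flow, so the loop terminates after at most as many iterations as there are arcs. For a unit flow, the returned weights sum to 1 and can be used directly as the selection probabilities in Step 3.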


The  parameter  k  specifies  the  number  of  trials  in  the  process  of  random  lightpath 
routing  generation.  The  higher  the  value  of  k,  the  more  likely  the  algorithm  will 
encounter  a  lightpath  routing  with  a  high  MCLC  value. 
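
Steps 3 and 4 of the algorithm can be sketched as follows. This is an illustrative sketch only: the names `random_k`, `mclc` and `is_connected` are hypothetical, each candidate path is represented simply by the set of fibers it traverses, and the MCLC of each sampled routing is evaluated by brute-force enumeration rather than by the integer program, which is only practical for small instances.

```python
import random
from itertools import combinations


def is_connected(nodes, links):
    """Return True if the undirected graph (nodes, links) is connected."""
    nodes = list(nodes)
    adj = {v: set() for v in nodes}
    for s, t in links:
        adj[s].add(t)
        adj[t].add(s)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == len(nodes)


def mclc(nodes, routing):
    """Brute-force Min Cross Layer Cut of a routing (link -> set of fibers)."""
    fibers = list(set().union(*routing.values()))
    for size in range(1, len(fibers) + 1):
        for failed in combinations(fibers, size):
            failed = set(failed)
            alive = [l for l, path in routing.items() if not (path & failed)]
            if not is_connected(nodes, alive):
                return size
    return float("inf")  # only reached if some links use no fiber at all


def random_k(nodes, flow_paths, k, rng=None):
    """Sample k routings from the flow paths and keep the one with highest MCLC.

    flow_paths maps each lightpath (s, t) to a list of (fiber_set, weight)
    pairs, e.g. the output of a flow decomposition (assumed format).
    """
    if rng is None:
        rng = random.Random()
    best, best_value = None, -1
    for _ in range(k):
        routing = {}
        for link, candidates in flow_paths.items():
            paths = [p for p, _ in candidates]
            weights = [w for _, w in candidates]
            # Pick one path with probability proportional to its flow weight.
            routing[link] = rng.choices(paths, weights=weights, k=1)[0]
        value = mclc(nodes, routing)
        if value > best_value:
            best, best_value = routing, value
    return best, best_value
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing different values of $k$ on the same instance.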

Although the last step requires the MCLC computation of the lightpath routings,
the integer program $M_{MCLC}$ contains only $|E_P|$ binary variables, which is much
fewer than the $|E_P||E_L|$ variables contained in the multi-commodity flow formulations.
Therefore, the randomized algorithm runs considerably faster than the integer
program algorithm. In the next section, we will compare the performance of the two
algorithms, both in terms of running time and quality of the solution.


2.5  Simulation 

In this section, we discuss our simulation results for the algorithms introduced in
Section 2.4. We first compare the lightpath routing algorithms by solving the ILP
directly and by randomized rounding. Next, we compare the survivability performance
among different formulations. Finally, we investigate the different lower bounds of
MCLC, and their effects on the MCLC value of the lightpath routing when used as
an optimization objective.

ILP vs Randomized Rounding

In  this  experiment,  we  use  the  NSFNET  (Figure  2-3)  as  the  physical  topology.  The 
network  is  augmented  to  have  connectivity  4,  which  makes  it  possible  to  study  the 
performance  of  the  algorithms  where  a  higher  MCLC  value  is  possible.  We  generated 
350  random  logical  topologies  with  connectivity  at  least  4,  and  size  ranging  from  6  to 
12 nodes. Using the formulation $MCF_w$ with the weight function $w_{MinCut}(s,t)$ introduced
in Section 2.4.2 as our benchmark, we compare the performance of $RANDOM_{10}$ against
solving the ILP optimally.


Figure  2-3:  The  augmented  NSFNET.  The  dashed  lines  are  the  new  links. 

Table 2.1 compares the average running time of the algorithms ILP and
$RANDOM_{10}$ for various logical topology sizes. All simulations are run on a Xeon
E5420 2.5GHz workstation with 4GB of memory, using CPLEX to solve the integer
and linear programs. As the number of logical nodes increases, the running time for
the integer program algorithm ILP increases tremendously, from about 33 seconds at
6 nodes to more than 15,000 seconds at 12 nodes. On the other hand, there
is no observable growth in the average running time for the algorithm $RANDOM_{10}$,
which is less than a minute. In fact, our simulation on larger networks shows that
the algorithm ILP often fails to terminate within a day when the network size goes
beyond 12 nodes. On the other hand, the algorithm $RANDOM_{10}$ for $MCF_w$ is able to
terminate consistently within 2 hours for very large instances with a 100-node physical
topology and 50-node logical topology. This shows that the randomized approach is
a much more scalable solution to compute survivable lightpath routings.

Logical Topology Size    Average Running Time (seconds)
                         ILP          RANDOM_10
 6                          33.2        31.9
 7                          50.5        33.9
 8                         660.0        30.1
 9                        1539.0        26.4
10                        3090.6        32.3
11                        8474.5        32.0
12                       15369.7        29.7

Table 2.1: Average running time of ILP and $RANDOM_{10}$.

In Figure 2-4, the survivability performance of the randomized algorithm is compared
with its ILP counterpart. Each data point in the figure is the MCLC average
of 50 random instances with the given logical network size. As our result shows, the
lightpath routings produced by $RANDOM_{10}$ have higher MCLC values than those
obtained by solving the ILP optimally. This is because the objective value for the ILP
$MCF_w$ is a lower bound on MCLC. As we will see in Section 2.5, this lower bound is
often not tight enough to accurately reflect the MCLC value, which means that the
optimal solution to the ILP does not necessarily yield a lightpath routing with maximum
MCLC. On the other hand, the randomized algorithm generates lightpath routings
non-deterministically based on the optimal fractional solution of $MCF_w$. Therefore, it
approximates the lightpath routing given by the ILP, with an additional randomization
component to explore better solutions. When the randomized rounding process is
repeated many times, the algorithm often encounters a solution that is even better than
the one given by the ILP.


Figure 2-4: MCLC performance of randomized rounding vs ILP.

To sum up, randomized rounding provides an efficient alternative to solving integer
programs  without  observable  quality  degradation.  This  allows  us  to  experiment  with 
more  complex  formulations  in  larger  networks  where  solving  the  integer  programs 
optimally  is  infeasible.  In  the  next  section,  we  will  compare  the  different  formulations 
introduced  in  Section  2.4.2,  using  randomized  rounding  to  compute  the  lightpath 
routings. 

Lightpath  Routing  with  Different  Formulations 

In  this  experiment,  we  study  the  survivability  performance  of  the  lightpath  routings 
generated  by  the  formulations  introduced  in  Section  2.4.  We  use  the  24-node  USIP 
network  (Figure  2-5),  augmented  to  have  connectivity  4,  as  the  physical  topology.  We 
generate  500  random  graphs  with  connectivity  4  and  size  ranging  from  6  to  15  nodes 
as  logical  topologies. 


Figure  2-5:  The  augmented  USIP  network.  The  dashed  lines  are  the  new  links. 

We compare the MCLC performance of the lightpath routings generated by the
randomized rounding algorithm, $RANDOM_{100}$, on the following formulations:

1. Multi-Commodity Flow $MCF_w$, using the identity function as the weight function,
i.e., $w(s,t) = 1$ for all $(s,t) \in E_L$ (Identity);

2. Multi-Commodity Flow $MCF_w$, using the weight function $w_{MinCut}$ introduced
in Section 2.4.2 (MinCut);

3. Enhanced Multi-Commodity Flow $MCF_{LF}$ (LF).

For comparison, we also run randomized rounding on the Survivable Lightpath
Routing  formulation  (SURVIVE),  introduced  in  [76],  which  computes  the  lightpath 
routing  that  minimizes  the  total  fiber  hops,  subject  to  the  constraint  that  the  MCLC 
must  be  at  least  two. 

Figure  2-6  compares  the  average  MCLC  values  of  the  lightpath  routings  computed 
by  the  four  different  algorithms.  Overall,  the  formulations  introduced  in  this  chapter 
achieve  better  survivability  than  SURVIVE.  This  is  because  these  formulations  try  to 
maximize  the  MCLC  in  their  objective  functions,  whereas  SURVIVE  minimizes  the 
physical hops. Therefore, even though SURVIVE does well in finding a survivable
routing (i.e., MCLC $\ge$ 2), the new formulations are able to achieve even higher MCLC
values, which allow more physical failures to be tolerated.

To  further  verify  the  survivability  performance  of  the  lightpath  routings  from  a 
different  perspective,  for  each  lightpath  routing,  we  simulated  the  scenario  where  each 
physical  link  fails  independently  with  probability  0.01.  Figure  2-7  shows  the  average 
probability  that  the  logical  topology  becomes  disconnected  under  this  scenario.  The 
result is consistent with Figure 2-6, as lightpath routings with higher MCLC values
can tolerate more physical failures, and the logical topologies are thus more likely to
stay connected.
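The failure simulation described above can be sketched as follows. This is a minimal illustration, not the simulator used for Figure 2-7: the function names, the dictionary-based representation of the lightpath routing, and the two-node toy instance are all assumptions made for the example.

```python
import random

def logical_connected(nodes, links, routing, failed):
    """True if the logical topology is connected when the fibers in `failed` are down."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for name, (s, t) in links.items():
        # A lightpath survives only if none of its fibers has failed.
        if not (routing[name] & failed):
            parent[find(s)] = find(t)
    return len({find(v) for v in nodes}) == 1

def estimate_disconnect_prob(nodes, links, routing, fibers, p, trials, seed=0):
    """Monte Carlo estimate of P(logical topology disconnected) when each
    fiber fails independently with probability p."""
    rng = random.Random(seed)
    disconnected = 0
    for _ in range(trials):
        failed = {f for f in fibers if rng.random() < p}
        if not logical_connected(nodes, links, routing, failed):
            disconnected += 1
    return disconnected / trials

# Hypothetical toy instance: two parallel lightpaths on disjoint 3-hop routes.
NODES = ["s", "t"]
LINKS = {"e1": ("s", "t"), "e2": ("s", "t")}
ROUTING = {"e1": {"a1", "a2", "a3"}, "e2": {"b1", "b2", "b3"}}
FIBERS = ["a1", "a2", "a3", "b1", "b2", "b3"]
```

For this toy instance the exact disconnection probability is $(1 - (1-p)^3)^2$, which gives a simple sanity check on the estimate.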


Figure 2-6: MCLC performance of different lightpath routing formulations.

Figure 2-7: Probability that the logical topology becomes disconnected if physical links fail independently
with probability 0.01.

The quality of the lightpath routing also depends on the graph structures captured
by the formulations. Compared with $\text{MCF}_{\text{Identity}}$, the formulation $\text{MCF}_{\text{MinCut}}$ uses a
weight function that captures the connectivity structure of the logical topology. As
a result, the algorithm will try to avoid putting edges that belong to smaller cuts
onto the same physical link, thereby minimizing the impact of a physical link failure
on these critical edges. This allows the algorithm $\text{MCF}_{\text{MinCut}}$ to produce lightpath
routings with higher MCLC values than $\text{MCF}_{\text{Identity}}$.

The enhanced formulation $\text{MCF}_{\text{LF}}$ captures the connectivity structure of the logical
topology in much greater detail, by having a constraint to describe the impact of a
physical link failure on each logical cut. As a result, the algorithm based on this
formulation is able to provide lightpath routings with the highest MCLC values.

Lower  Bound  Comparison 

In Theorem 2.12 we establish different lower bounds for the MCLC. In this experiment,
we measure these lower bound values for 500 different lightpath routings, and
compare them to the actual MCLC values.

As Figure 2-8 shows, the Weighted Load Factor is a very close approximation of
the Min Cross Layer Cut. Among the 500 routings being investigated, the two metrics
are identical in 368 cases. This suggests a tight connection between the two metrics,
which also justifies the choice of such metrics as survivability measures.

The figure also reveals a strong correlation between the MCLC performance and
the tightness of the lower bounds given by the multi-commodity flow formulations
in Section 2.4.2. Compared to $\text{MCF}_w$, the formulation $\text{MCF}_{\text{LF}}$ provides an objective
value that is closer to the actual MCLC value of the lightpath routing. This translates
to better lightpath routings, as we saw in Figure 2-6. Since there is still a large gap
between the $\text{MCF}_{\text{LF}}$ objective value and the MCLC value, this suggests room for
further improvement with a formulation that gives a better MCLC lower bound.

To summarize this section, a good formulation that properly captures the cross-layer
connectivity structure is essential for generating lightpath routings with high
survivability. Combined with randomized rounding, it gives a powerful tool for
designing highly survivable layered networks.


Figure 2-8: Comparison among Min Cross Layer Cut (MCLC), Weighted Load Factor (WLF), and
the optimal values of $\text{ILP}_{\text{LF}}$ and $\text{ILP}_{\text{MinCut}}$.

2.6 Conclusion


In this chapter, we introduce the problem of maximizing the connectivity of layered
networks. We show that survivability metrics in multi-layer networks have significantly
different meanings from their single-layer counterparts. We propose two survivability
metrics, the Min Cross Layer Cut and the Weighted Load Factor, that measure
the connectivity of a multi-layer network, and develop linear and integer formulations
to compute these metrics. In addition, we use the metric Min Cross Layer Cut as the
objective for the survivable lightpath routing problem, and develop multi-commodity
flow formulations to approximate this objective. We show, through simulations, that
our algorithms produce lightpath routings with significantly better Min Cross Layer
Cut values than existing survivable lightpath routing algorithms.

Our  simulations  show  that  a  good  formulation,  combined  with  the  randomized 
rounding  technique,  provides  a  powerful  tool  for  generating  highly  survivable  layered 
networks.  Therefore,  an  important  direction  for  future  research  is  to  establish  a  better 
formulation  for  the  lightpath  routing  problem  that  maximizes  the  Min  Cross  Layer 
Cut.  The  multi-commodity  flow  formulation  introduced  in  this  chapter  approximates 
the  Min  Cross  Layer  Cut  by  using  its  lower  bound  as  the  objective  function.  However, 
this  lower  bound  is  often  not  very  close  to  the  actual  Min  Cross  Layer  Cut  value. 
A  better  objective  function,  such  as  the  Weighted  Load  Factor,  would  significantly 
improve  the  proposed  lightpath  routing  algorithms. 

The similarity between the Min Cross Layer Cut and the Weighted Load Factor is
also intriguing. Our simulation results demonstrated a very tight connection between
the two metrics. This observation might reflect certain properties of cross-layer network
connectivity that are yet to be discovered and formalized. A better understanding of
how these metrics relate to each other may lead to important insights into
the cross-layer survivability problem.


2.7 Chapter Appendix


2.7.1  Proof  of  Theorem  2.1 

Theorem 2.1: Let $G_L$ be a logical topology with two nodes $s$ and $t$, connected by $n$
lightpaths $E_L = \{e_1, e_2, \ldots, e_n\}$, and let $\mathcal{R} = \{R_1, R_2, \ldots, R_k\}$ be a family of subsets
of $E_L$ where each $|R_i| \ge 2$. There exists a physical topology $G_P = (V_P, E_P)$ and
lightpath routing of $G_L$ over $G_P$, such that:

1. there are exactly $k$ fibers in $E_P$, denoted by $F = \{f_1, f_2, \ldots, f_k\}$, that are used
by multiple lightpaths;

2. for each fiber $f_i \in F$, the set of lightpaths using the fiber $f_i$ is $R_i$.

Proof. Given a logical topology $G_L = (V_L, E_L)$ with two nodes $s$ and $t$ connected
by $n$ lightpaths $E_L = \{e_1, e_2, \ldots, e_n\}$, and the family $\mathcal{R} = \{R_1, R_2, \ldots, R_k\}$ of
subsets of $E_L$, we construct a physical topology and lightpath routing that satisfy the
conditions specified in the theorem.

• Physical Topology:

The physical topology contains the two end nodes $s$ and $t$ in the logical network.
In addition, between the two end nodes, there are $n$ groups of nodes. Each group
$i$ contains $k + 1$ nodes, namely $x^i_0, x^i_1, \ldots, x^i_k$. For any $i \in \{1, \ldots, n\}$, $j \in
\{1, \ldots, k\}$, there is an edge connecting nodes $x^i_{j-1}$ and $x^i_j$. In addition, $s$ is
connected to $x^i_0$ and $x^i_k$ is connected to $t$ for all $i \in \{1, \ldots, n\}$. In other words,
in the physical network we have constructed so far, there are $n$ edge-disjoint
paths connecting $s$ and $t$, and each path has $k + 2$ edges.

Next, we add $k$ pairs of nodes $\{(y_1, z_1), \ldots, (y_k, z_k)\}$ to the physical network,
where each node pair $(y_j, z_j)$ is connected by an edge. Finally, we connect $x^i_{j-1}$
to $y_j$ and $z_j$ to $x^i_j$ for all $i \in \{1, \ldots, n\}$, $j \in \{1, \ldots, k\}$.

• Lightpath Routing:

We will define a route in the physical topology for each lightpath $e_i$. Each route
will contain $k + 2$ segments:

$$s \rightsquigarrow x^i_0 \rightsquigarrow x^i_1 \rightsquigarrow \cdots \rightsquigarrow x^i_k \rightsquigarrow t$$

Segments $s \rightsquigarrow x^i_0$ and $x^i_k \rightsquigarrow t$ will take the direct edges $s \to x^i_0$ and $x^i_k \to t$
respectively as their routes. The routes for other segments depend on whether
$e_i$ is in $R_j$:

- If $e_i \in R_j$, the route for $x^i_{j-1} \rightsquigarrow x^i_j$ is $x^i_{j-1} \to y_j \to z_j \to x^i_j$.

- If $e_i \notin R_j$, the route for $x^i_{j-1} \rightsquigarrow x^i_j$ is $x^i_{j-1} \to x^i_j$.

Figure 2-9 shows the physical topology and lightpath routing constructed from a
two-node logical topology with $\mathcal{R}$ = {{1,2}, {2}, {1,3}, {1}}.

By construction, all fibers except $\{(y_1, z_1), \ldots, (y_k, z_k)\}$ are used by at most one
lightpath. Also, a lightpath $e_i$ uses fiber $(y_j, z_j)$ if and only if $e_i$ is in $R_j$. In other
words, there are exactly $k$ fibers, $(y_1, z_1), \ldots, (y_k, z_k)$, that are used by multiple
lightpaths, and each fiber $(y_i, z_i)$ is used by the lightpaths in $R_i$. □
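The construction in this proof is mechanical enough to check by code. The sketch below builds only the routes (fibers that no lightpath uses are omitted), with node names encoded as tuples; the function names and the example family $\mathcal{R}$ are hypothetical and chosen so that every set has at least two lightpaths.

```python
def build_routes(n, family):
    """Routes for lightpaths e_1..e_n given family = [R_1, ..., R_k],
    where each R_j is a set of lightpath indices (1-based).
    Returns {i: list of fibers}, each fiber being a (node, node) pair."""
    k = len(family)
    routes = {}
    for i in range(1, n + 1):
        def x(j, i=i):
            return ("x", i, j)            # node x^i_j in group i
        route = [("s", x(0))]             # direct edge s -> x^i_0
        for j in range(1, k + 1):
            if i in family[j - 1]:
                # e_i is in R_j: detour through the shared fiber (y_j, z_j).
                route += [(x(j - 1), ("y", j)),
                          (("y", j), ("z", j)),
                          (("z", j), x(j))]
            else:
                route.append((x(j - 1), x(j)))
        route.append((x(k), "t"))         # direct edge x^i_k -> t
        routes[i] = route
    return routes

def shared_fibers(routes):
    """Map each fiber used by two or more lightpaths to the set of its users."""
    users = {}
    for i, route in routes.items():
        for fiber in route:
            users.setdefault(fiber, set()).add(i)
    return {f: u for f, u in users.items() if len(u) > 1}

# Hypothetical example: three lightpaths, three sharing sets.
FAMILY = [{1, 2}, {2, 3}, {1, 3}]
SHARED = shared_fibers(build_routes(3, FAMILY))
```

In this example `SHARED` contains exactly the three fibers $(y_j, z_j)$, each mapped to the corresponding set $R_j$, as the theorem asserts.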


2.7.2  Proof  of  Theorem  2.2 


Let $\text{MinSPS}_{st}$ be the size of the minimum survivable path set between the logical
nodes $s$ and $t$. Theorem 2.2 describes the relationship between the value of $\text{MinSPS}_{st}$
and the relaxed Max Flow, $\text{MaxFlow}^R_{st}$, between the two nodes:

Theorem 2.2: $\text{MinSPS}_{st} \le \left\lfloor \dfrac{\log |E_P|}{\log \text{MaxFlow}^R_{st}} \right\rfloor + 1$.


Proof. Let $\mathcal{P}_{st}$ and $E_P$ be the set of logical $s$-$t$ paths and the set of physical links
respectively. For each $s$-$t$ path $p \in \mathcal{P}_{st}$, denote the set of physical links used by $p$
as $L(p)$. We first construct a bipartite graph on the node set $(\mathcal{P}_{st}, E_P)$. There is an
edge $(p, l) \in \mathcal{P}_{st} \times E_P$ if and only if the $s$-$t$ path $p$ does not use physical link $l$, i.e.,
$l \notin L(p)$. In other words, the edge $(p, l)$ is in the bipartite graph if and only if the
path $p$ survives the failure of physical link $l$.

Figure 2-9: The physical topology and lightpath routing on three lightpaths between two
nodes $s$ and $t$, and lightpath-sharing relationship $\mathcal{R}$ = {{1,2}, {2}, {1,3}, {1}}.

We prove the theorem by explicitly constructing a survivable path set with size
at most $\left\lfloor \frac{\log |E_P|}{\log \text{MaxFlow}^R_{st}} \right\rfloor + 1$ using the bipartite graph. Algorithm $\text{SPS}_{\text{greedy}}$ describes
a greedy algorithm that constructs the path set by repeatedly selecting $s$-$t$ paths
and removing physical links whose failures the selected path can survive. When the
algorithm terminates, every physical link failure is survived by a selected path in the
output. Therefore, the algorithm gives a survivable path set.


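Algorithm 2 $\text{SPS}_{\text{greedy}}$

1: $P := \emptyset$, $S := E_P$
2: while $S \ne \emptyset$ do:

   - Select $p \in \mathcal{P}_{st}$ with the largest node degree in the bipartite graph.

   - $P := P \cup \{p\}$, $S := S \cap L(p)$ (the links that $p$ survives, i.e., those not in $L(p)$, leave $S$)

   - Remove node $p$ and its neighbors (the physical links not in $L(p)$) from the bipartite graph.

3: Return $P$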
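The greedy procedure can be sketched in a few lines. The following is an illustration under the reading that each selected path removes the physical links whose failure it survives (its neighbors in the bipartite graph); the function name, the dictionary representation of $L(p)$, and the error handling are assumptions of this sketch rather than part of the algorithm statement.

```python
def sps_greedy(paths, fibers):
    """paths: {path name: set of fibers it uses, L(p)}; fibers: the set E_P.
    Returns a list of path names such that every single fiber failure
    is survived by at least one selected path."""
    uncovered = set(fibers)          # S: fibers not yet survived by a chosen path
    remaining = dict(paths)          # paths still in the bipartite graph
    chosen = []
    while uncovered:
        # Degree of p = number of uncovered fibers that p does not use.
        best = max(remaining,
                   key=lambda p: len(uncovered - remaining[p]),
                   default=None)
        survived = uncovered - remaining[best] if best is not None else set()
        if not survived:
            # Some fiber is used by every remaining path: no survivable set.
            raise ValueError("no survivable path set exists")
        chosen.append(best)
        uncovered -= survived        # remove p's neighbors from the graph
        del remaining[best]
    return chosen

# Hypothetical instance with four fibers and three s-t paths.
PATHS = {"p1": {"a", "b"}, "p2": {"c", "d"}, "p3": {"a", "c"}}
CHOSEN = sps_greedy(PATHS, {"a", "b", "c", "d"})
```

Here `p1` and `p2` use disjoint fiber sets, so together they survive any single fiber failure and the greedy selection stops after two paths.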


The key observation for this algorithm is that every iteration of the algorithm
removes a constant fraction of the remaining nodes in $E_P$. We state this result as the
following lemma:

Lemma 2.13 Let $B^i$ be the bipartite graph at the beginning of the $i$-th iteration of the
algorithm, where the remaining node sets for $E_P$ and $\mathcal{P}_{st}$ are $E^i_P$ and $\mathcal{P}^i_{st}$ respectively.
There exists a node in $\mathcal{P}^i_{st}$ with node degree at least $\frac{\alpha - 1}{\alpha} |E^i_P|$, where $\alpha$ is the optimal
value for the formulation $\text{MaxFlow}^R_{st}$.

Proof. Suppose $\{f^*_p \mid p \in \mathcal{P}_{st}\}$ is the optimal solution for $\text{MaxFlow}^R_{st}$, such that:

$$\sum_{p \in \mathcal{P}_{st}} f^*_p = \alpha.$$

For the purpose of analysis, for each edge $(p, l) \in \mathcal{P}^i_{st} \times E^i_P$ in the bipartite graph, we
assign the edge a weight $f^*_p$.

For each node $v$ in the bipartite graph, let $d(v)$ be its node degree, and we define
its weight $w(v)$ to be the sum of the weights of its incident edges. Then we have:

$$\sum_{p \in \mathcal{P}^i_{st}} \frac{w(p)}{d(p)} = \sum_{p \in \mathcal{P}^i_{st}} f^*_p \le \alpha. \tag{2.10}$$

For each node $l$ in $E^i_P$, its neighbors in $\mathcal{P}^i_{st}$ are the same as its neighbors in $\mathcal{P}_{st}$,
since otherwise it should have already been removed from the bipartite graph. Its
node weight is:

$$w(l) = \sum_{p \in \mathcal{P}_{st} : l \notin L(p)} f^*_p = \sum_{p \in \mathcal{P}_{st}} f^*_p - \sum_{p \in \mathcal{P}_{st} : l \in L(p)} f^*_p \ge \alpha - 1,$$

since $\sum_{p : l \in L(p)} f^*_p \le 1$, by Equation (2.1).

Therefore the total weight for the nodes in $E^i_P$ is at least $|E^i_P|(\alpha - 1)$, which
implies:

$$\sum_{p \in \mathcal{P}^i_{st}} w(p) \ge |E^i_P|(\alpha - 1). \tag{2.11}$$

Let $d_{max}$ be the largest node degree among the nodes in $\mathcal{P}^i_{st}$. We have:

$$d_{max} \ge \frac{\sum_{p \in \mathcal{P}^i_{st}} w(p)}{\sum_{p \in \mathcal{P}^i_{st}} w(p)/d(p)} \ge \frac{|E^i_P|(\alpha - 1)}{\alpha}, \text{ by Equations (2.10) and (2.11).}$$

Therefore, the set $\mathcal{P}^i_{st}$ contains a node with degree at least $\frac{\alpha - 1}{\alpha} |E^i_P|$. □


As a result of Lemma 2.13, every iteration of the algorithm removes at least a fraction
$\frac{\alpha - 1}{\alpha}$ of the remaining nodes of $E_P$ from the bipartite graph. Therefore, after the $i$-th
path is selected, the number of nodes in $E_P$ that remain in the bipartite graph is at most
$\left(1 - \frac{\alpha - 1}{\alpha}\right)^i |E_P|$. The algorithm will terminate as soon as:

$$\left(1 - \frac{\alpha - 1}{\alpha}\right)^i |E_P| < 1, \text{ which implies } i > \frac{\log |E_P|}{\log \alpha}.$$

Therefore, the algorithm returns a survivable path set with size at most $\lfloor \log_\alpha |E_P| \rfloor + 1$. □


2.7.3 Proof of Theorem 2.7


Theorem 2.7: Given the lightpath routing for a multi-layer network $\mathcal{G} = (G_P, G_L)$,
finding its Minimum Cross-Layer Spanning Tree is NP-hard.

Proof. We prove the theorem by constructing a reduction from the NP-hard Minimum
Label Spanning Tree problem [26]:

Minimum Label Spanning Tree: Given a graph $G = (V, E)$, and a set of labels
$\mathcal{L} = \{L_1, \ldots, L_m\}$. Each edge $e \in E$ is associated with a set of labels $\mathcal{L}_e \subseteq \mathcal{L}$. Find
a spanning tree $T$ of $G$ with the minimum number of labels, that is, the value $|\bigcup_{e \in T} \mathcal{L}_e|$
is minimized.

Given an instance of the Minimum Label Spanning Tree problem, we will construct
an instance of the Minimum Cross-Layer Spanning Tree problem, which consists of
the physical topology $G_P = (V_P, E_P)$, logical topology $G_L = (V_L, E_L)$ and lightpath
routing.

Logical Topology: The logical topology $G_L$ is the same as the graph $G$ in the
Minimum Label Spanning Tree problem.

Physical Topology: The physical topology contains all the nodes in the logical
topology. In addition, for each label $L_i \in \mathcal{L}$, we add a pair of nodes $p_i$ and $q_i$, with a
physical link $(p_i, q_i)$ connecting the two nodes.

Next, for each logical link $(s,t) \in E_L$, we denote $h^{st}_0 = s$ and $h^{st}_{|\mathcal{L}|} = t$. Between $h^{st}_0$
and $h^{st}_{|\mathcal{L}|}$, we insert a sequence of $2|\mathcal{L}| - 1$ physical nodes $x^{st}_1, h^{st}_1, \ldots, h^{st}_{|\mathcal{L}|-1}, x^{st}_{|\mathcal{L}|}$
and construct a physical path between the two nodes:

$$h^{st}_0 \to x^{st}_1 \to h^{st}_1 \to \cdots \to x^{st}_{|\mathcal{L}|} \to h^{st}_{|\mathcal{L}|}$$

Finally, for each label $L_i \in \mathcal{L}$ and each logical link $(s,t) \in E_L$, we add two
physical links $(h^{st}_{i-1}, p_i)$ and $(q_i, h^{st}_i)$.

Lightpath Routing: For each logical link $(s,t)$, the lightpath routing for $(s,t)$
consists of $|\mathcal{L}|$ segments $s \rightsquigarrow h^{st}_1 \rightsquigarrow h^{st}_2 \rightsquigarrow \cdots \rightsquigarrow h^{st}_{|\mathcal{L}|-1} \rightsquigarrow t$.

For each $i \in \{1, \ldots, |\mathcal{L}|\}$, the route for each segment $h^{st}_{i-1} \rightsquigarrow h^{st}_i$ depends on
whether the edge $(s,t)$ has label $L_i$ in the original Minimum Label Spanning Tree
instance. If the edge has label $L_i$, then the segment $h^{st}_{i-1} \rightsquigarrow h^{st}_i$ takes on the route
$h^{st}_{i-1} \to p_i \to q_i \to h^{st}_i$. Otherwise, $h^{st}_{i-1} \rightsquigarrow h^{st}_i$ takes on the route $h^{st}_{i-1} \to x^{st}_i \to h^{st}_i$.

Under this lightpath routing, only physical links of the form $(p_i, q_i)$ can be shared
by multiple logical links. Other physical links can be used by at most one logical link.
We call the first kind of physical links non-exclusive physical links, and the others
exclusive physical links.

Therefore, each segment $h^{st}_{i-1} \rightsquigarrow h^{st}_i$ traverses exactly two exclusive physical links,
and in addition one non-exclusive link if the edge $(s,t)$ has label $L_i$ in the corresponding
Minimum Label Spanning Tree problem. In other words, each logical link $(s,t)$
traverses $2|\mathcal{L}|$ exclusive physical links and $|\mathcal{L}_{st}|$ non-exclusive physical links, where
$\mathcal{L}_{st}$ is the set of labels associated with $(s,t)$.
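To make the link counts concrete, the routing rule of the reduction can be written out directly. The sketch below is illustrative only: the tuple encoding of the nodes $h^{st}_i$, $x^{st}_i$, $p_i$, $q_i$, the function names, and the small labelled graph are assumptions of the example.

```python
def lightpath_routes(edge_labels, m):
    """edge_labels: {(s, t): set of label indices in 1..m}.
    Returns {(s, t): list of physical links} following the reduction."""
    routes = {}
    for (s, t), labels in edge_labels.items():
        def h(i, s=s, t=t):
            # h_0 = s, h_m = t, intermediate nodes are private to (s, t).
            if i == 0:
                return s
            if i == m:
                return t
            return ("h", s, t, i)
        route = []
        for i in range(1, m + 1):
            if i in labels:
                hops = [h(i - 1), ("p", i), ("q", i), h(i)]   # via shared (p_i, q_i)
            else:
                hops = [h(i - 1), ("x", s, t, i), h(i)]       # via private x_i
            route += list(zip(hops, hops[1:]))
        routes[(s, t)] = route
    return routes

def count_links(route, m):
    """Return (number of exclusive links, number of non-exclusive links)."""
    shared = {(("p", i), ("q", i)) for i in range(1, m + 1)}
    non_exclusive = sum(1 for link in route if link in shared)
    return len(route) - non_exclusive, non_exclusive

# Hypothetical instance: a triangle with three labels.
EDGE_LABELS = {(1, 2): {1, 2}, (2, 3): {3}, (1, 3): {1}}
ROUTES = lightpath_routes(EDGE_LABELS, 3)
```

For every logical link in this instance, `count_links` returns $2|\mathcal{L}| = 6$ exclusive links and as many non-exclusive links as the edge has labels.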

An example of the reduction is shown in Figures 2-10 and 2-11.

We prove the following lemma, which implies that finding the minimum label
spanning tree reduces to finding the minimum cross-layer spanning tree.

Lemma 2.14 Let $\alpha$ be the number of labels associated with the optimal solution for
the Minimum Label Spanning Tree instance, and let $\beta$ be the number of physical
links in the minimum cross-layer spanning tree of the instance under the reduction. Then
$\beta = 2(n - 1)|\mathcal{L}| + \alpha$.

Proof. Since at least $n - 1$ logical links must survive if a cross-layer spanning tree
survives, and each logical link uses exactly $2|\mathcal{L}|$ exclusive fibers, every cross-layer
spanning tree contains at least $2(n - 1)|\mathcal{L}|$ exclusive fibers.

First, suppose $T$ is the minimum label spanning tree in the Minimum Label Spanning
Tree problem with $\alpha$ labels. In the corresponding Minimum Cross-Layer Spanning
Tree problem, $T$ is also a spanning tree for the logical topology where each
logical link $(s,t) \in T$ traverses $2|\mathcal{L}|$ exclusive physical links and $|\mathcal{L}_{st}|$ non-exclusive
physical links. Note that the logical link $(s,t)$ uses the non-exclusive link $(p_i, q_i)$ if
and only if the edge $(s,t)$ is associated with label $L_i$ in the Minimum Label Spanning
Tree problem. Therefore, the set of non-exclusive links used by $(s,t)$ corresponds to
the set of labels associated with the edge $(s,t)$ in the Minimum Label Spanning Tree
instance. This implies that the set of non-exclusive links used by all logical links in
$T$ is exactly the set of labels associated with $T$ in the Minimum Label Spanning Tree
problem. Therefore, the logical links in $T$ use a total of $2(n - 1)|\mathcal{L}|$ exclusive links
and $\alpha$ non-exclusive links. Since $T$ is a logical spanning tree, this set of physical links
contains a cross-layer spanning tree. As a result, we have $\beta \le 2(n - 1)|\mathcal{L}| + \alpha$.

Figure 2-10: Minimum Label Spanning Tree instance.

Now, assume that $\beta < 2(n - 1)|\mathcal{L}| + \alpha$. The minimum cross-layer spanning tree
$S$ therefore contains fewer than $\alpha$ non-exclusive links. Let $W$ be the set of logical
links that survive if only the physical links in $S$ survive. Since $W$ is a connected
subgraph of $E_L$, it contains a logical spanning tree $T$ that uses fewer than $\alpha$ non-exclusive
links. Since the set of non-exclusive links used by $T$ corresponds to the set
of labels associated with the spanning tree $T$ in the Minimum Label Spanning Tree
problem, this contradicts the fact that the minimum label spanning tree has $\alpha$ labels.
Therefore, we have $\beta \ge 2(n - 1)|\mathcal{L}| + \alpha$. □

Because of Lemma 2.14, finding the minimum label spanning tree can be reduced
to finding the minimum cross-layer spanning tree under the reduction. □


2.7.4  Proof  of  Theorem  2.9 


Theorem 2.9: Computing the Weighted Load Factor for a lightpath routing is NP-hard
even if the weight assignment $w_{st}$ for the logical links is fixed.

Proof. We construct a reduction from the NP-hard Uniform Sparsest Cut [1] problem:


Uniform Sparsest Cut:

Given an undirected graph $G = (V, E)$, compute the value of $\min_{S \subset V} \dfrac{|\delta(S)|}{|S|\,|V - S|}$.


Figure 2-11: Minimum Cross-Layer Spanning Tree instance. (a) Logical Topology. (b) Physical Topology. (c) Lightpath Routing.


Given the graph $G = (V, E)$ in an instance of the Uniform Sparsest Cut problem,
we construct an instance of the Weighted Load Factor problem, with the weight
assignment $w_{st}$ fixed, such that the optimal values of the two problems are identical.
Without loss of generality, we assume $G$ is connected. We will construct a physical
topology, logical topology, lightpath routing $f^{st}_{ij}$ and weight assignment $w_{st}$ of the
logical links based on the graph $G = (V, E)$ in the Uniform Sparsest Cut instance.

• Logical Topology: The logical topology is a complete graph on the vertex set
$V_L = V$. Each logical link $(s,t)$ has weight $w_{st} = 1$.

• Physical Topology: The physical topology is a complete graph on the vertex
set $V_P = V \cup \{u, v\}$, where $u$ and $v$ are two new vertices not in $V$.

• Lightpath Routing: For each logical link $(s,t)$, if $(s,t)$ is an edge of $G$ in
the Uniform Sparsest Cut instance, the logical link takes on the physical route
$s \to u \to v \to t$. Otherwise, it takes on the physical route $s \to t$.


Let $S$ be an arbitrary subset of $V$. Let $\delta_G(S)$ be the cut set of $S$ with respect
to graph $G$ of the Uniform Sparsest Cut instance, and let $\delta_L(S)$ be the cut set of $S$
with respect to the logical topology $G_L$, which is a complete graph on $V_L = V$. We
claim the following equality:

$$\frac{|\delta_G(S)|}{|S|\,|V - S|} = \max_{(i,j) \in E_P} \frac{\sum_{(s,t) \in \delta_L(S)} w_{st} f^{st}_{ij}}{\sum_{(s,t) \in \delta_L(S)} w_{st}} \tag{2.12}$$

This is because every physical link not attached to $u$ or $v$ is used by at most one logical
link. In addition, any logical link that uses a physical link of the form $(x, u)$ or $(v, x)$,
for any $x$ in $V_P$, also uses $(u, v)$ in the lightpath routing. Since $G$ is connected, for each
$S \subset V$, there is at least one logical link in $\delta_G(S)$ that uses the physical link $(u, v)$.
Therefore, for any $S \subset V_L$, the physical link $(u, v)$ carries the largest number of logical
links in $\delta_L(S)$. Since a logical link uses $(u, v)$ if and only if the corresponding edge
exists in $G$, the number of logical links in $\delta_L(S)$ using $(u, v)$ is $|\delta_G(S)|$. Therefore,
the fraction of weight carried by the physical link $(u, v)$ is $\frac{|\delta_G(S)|}{|S|\,|V - S|}$. This
implies the sparsest cut value equals the Weighted Load Factor value. □
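Equality (2.12) can be checked exhaustively on small graphs. The sketch below enumerates every nonempty proper subset $S$ and compares both sides using exact fractions; the function name and the five-edge example graph are hypothetical, and the node names `"u"` and `"v"` are assumed not to appear in $V$.

```python
from fractions import Fraction
from itertools import combinations

def cut_ratios(V, E):
    """For each nonempty proper subset S of V, return the pair
    (|delta_G(S)| / (|S||V-S|), max load fraction over physical links)."""
    G = {frozenset(e) for e in E}
    nodes = sorted(V)
    results = []
    for r in range(1, len(nodes)):
        for S in map(set, combinations(nodes, r)):
            load = {}                      # physical link -> number of cut lightpaths
            cut_size = 0                   # |delta_L(S)|, all weights equal to 1
            for s, t in combinations(nodes, 2):
                if (s in S) == (t in S):
                    continue               # logical link not in the cut
                cut_size += 1
                if frozenset((s, t)) in G:
                    route = [(s, "u"), ("u", "v"), ("v", t)]
                else:
                    route = [(s, t)]
                for link in route:
                    load[link] = load.get(link, 0) + 1
            g_cut = sum(1 for e in G if len(e & S) == 1)
            lhs = Fraction(g_cut, r * (len(nodes) - r))
            rhs = Fraction(max(load.values()), cut_size)
            results.append((lhs, rhs))
    return results

# Hypothetical connected graph on four nodes.
RATIOS = cut_ratios({1, 2, 3, 4}, [(1, 2), (2, 3), (3, 4), (1, 3)])
```

On this connected example the two sides agree for every subset, with the maximum always attained on the link $(u, v)$.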


2.7.5  Proof  of  Theorem  2.10 

Let MCLC and $\text{MCLC}^R$ be the optimal objective values for formulation $M_{MCLC}$ and
its linear relaxation $M^R_{MCLC}$ respectively. And let WLF be the Weighted Load Factor
of the lightpath routing. Theorem 2.10 states the following:

Theorem 2.10: $\text{MCLC}^R \le \text{WLF} \le \text{MCLC}$.

Proof. Recall that the ILP formulation for MCLC is:

$$M_{MCLC}: \quad \text{Minimize } \sum_{(i,j) \in E_P} y_{ij}, \text{ subject to:}$$

$$d_t - d_s \le \sum_{(i,j) \in E_P} y_{ij} f^{st}_{ij}, \quad \forall (s,t) \in E_L \tag{2.13}$$

$$\sum_{n \in V_L} d_n \ge 1 \tag{2.14}$$

$$d_0 = 0, \quad d_n, y_{ij} \in \{0, 1\}, \quad \forall n \in V_L, (i,j) \in E_P$$

where $f^{st}_{ij}$ are binary constants such that logical link $(s,t)$ traverses physical fiber
$(i,j)$ if and only if $f^{st}_{ij} = 1$.
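For very small instances, the optimal value of this formulation can be cross-checked by enumerating fiber subsets in order of increasing size. This is a brute-force sketch for validation only (exponential in the number of fibers), not a substitute for the ILP; the function names and the two toy routings are assumptions of the example.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Depth-first search connectivity test on an undirected multigraph."""
    adj = {v: set() for v in nodes}
    for s, t in edges:
        adj[s].add(t)
        adj[t].add(s)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v] - seen:
            seen.add(u)
            stack.append(u)
    return len(seen) == len(adj)

def mclc_bruteforce(nodes, links, routing, fibers):
    """Smallest number of fibers whose removal disconnects the logical topology.
    links: {name: (s, t)}; routing: {name: set of fibers used}."""
    for size in range(len(fibers) + 1):
        for cut in combinations(fibers, size):
            failed = set(cut)
            alive = [links[e] for e in links if not (routing[e] & failed)]
            if not is_connected(nodes, alive):
                return size
    return None  # the logical topology cannot be disconnected by fiber failures

# Hypothetical instance: two parallel lightpaths between s and t.
NODES = ["s", "t"]
LINKS = {"e1": ("s", "t"), "e2": ("s", "t")}
SHARED = {"e1": {"f1", "f2"}, "e2": {"f2", "f3"}}   # both lightpaths use f2
DISJOINT = {"e1": {"f1"}, "e2": {"f2"}}
```

With the `SHARED` routing a single failure of `f2` disconnects the two nodes, whereas the `DISJOINT` routing requires two fiber failures.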

For the rest of the proof, for any subset $S$ of the logical nodes $V_L$, we denote $\delta(S)$
to be the cut set of $S$, i.e., the set of logical links with exactly one end point in $S$.

We first prove that $\text{MCLC}^R \le \text{WLF}$. To do this, we construct the dual [17] of
$M^R_{MCLC}$:

$$M^{Dual}_{MCLC}: \quad \text{Maximize } q, \text{ subject to:}$$

$$\sum_{(s,t) \in E_L} g_{st} f^{st}_{ij} \le 1, \quad \forall (i,j) \in E_P \tag{2.15}$$

$$q + \sum_{(s,t) \in E_L} g_{st} - \sum_{(t,s) \in E_L} g_{ts} \le 0, \quad \forall s \in V_L \setminus \{0\} \tag{2.16}$$

$$q, g_{st} \ge 0, \quad \forall (s,t) \in E_L$$

The variables $y_{ij}$ in the primal $M^R_{MCLC}$ correspond to Constraint (2.15) in the
dual. Similarly, the variables $d_s$, where $s \ne 0$, in the primal correspond to Constraint
(2.16) in the dual. For Constraints (2.13) and (2.14) in the primal, the corresponding
variables in the dual are $g_{st}$ and $q$ respectively. We can interpret the variable $g_{st}$ as
the flow value assigned to logical link $(s,t)$. Then Constraint (2.15) requires that the
total flow on each physical fiber be at most 1. Constraint (2.16) requires at least $q$
units of net incoming flow for all nodes other than node 0. Intuitively, the dual program
tries to maximize the value $q$ such that the node 0 sends at least $q$ units of flow to
every other node, subject to the capacity constraint for each fiber.

We first prove Lemma 2.15, which will be used to establish the lower bound on
WLF.

Lemma 2.15 Let $(q, g)$ be a feasible solution for $M^{Dual}_{MCLC}$, and let

$$g(S) = \sum_{(s,t) \in E_L : s \notin S, t \in S} g_{st} - \sum_{(s,t) \in E_L : s \in S, t \notin S} g_{st}$$

be the net flow into the cut set $S$. Then $g(S) \ge kq$, for any $S \subseteq V_L \setminus \{0\}$ with $k = |S|$.

Proof. Consider an arbitrary node set $S \subseteq V_L \setminus \{0\}$, and let $k = |S|$. We prove by
induction on $k$ that $g(S) \ge kq$.

• Base case: $k = 0$: In this case, $S$ is an empty set and $g(S) \ge kq$ trivially.

• Inductive case: Suppose for some $0 \le k < |V_L| - 1$, $g(S) \ge kq$ for all $S$ with
$|S| = k$ and $0 \notin S$. Now let $S'$ be any subset of $k + 1$ nodes that does not
contain node 0, let $b$ be an arbitrary node in $S'$, and let $S'_b = S' \setminus \{b\}$. Since
$S'_b$ is a set of $k$ nodes, by the induction hypothesis, we have $g(S'_b) \ge kq$. It follows
that:

$$g(S') = g(S'_b) + \sum_{(t,b) \in E_L} g_{tb} - \sum_{(b,t) \in E_L} g_{bt}$$
$$\ge g(S'_b) + q, \text{ by Constraint (2.16)}$$
$$\ge (k + 1)q.$$


By induction, $g(S) \ge kq$ for all $S \subseteq V_L \setminus \{0\}$ with $k = |S|$. □

Now we are ready to prove that $\text{MCLC}^R \le \text{WLF}$. Given an optimal solution
$(q^*, g^*)$ to the formulation $M^{Dual}_{MCLC}$, the value of $g^*_{st}$ is a feasible assignment of the
variable $w_{st}$ in the Weighted Load Factor formulation $M_{WLF}$. The corresponding
objective value for this assignment is:

$$\min_{S \subset V_L, (i,j) \in E_P} \frac{\sum_{(s,t) \in \delta(S)} g^*_{st}}{\sum_{(s,t) \in \delta(S)} g^*_{st} f^{st}_{ij}} \ge \min_{S \subset V_L} \sum_{(s,t) \in \delta(S)} g^*_{st}, \text{ by Constraint (2.15)}$$
$$\ge q^*, \text{ by Lemma 2.15}$$

which implies $\text{WLF} \ge q^*$. On the other hand, by the Duality Theorem [17], the optimal
value for $M^R_{MCLC}$ is exactly $q^*$. Therefore we have $\text{MCLC}^R \le \text{WLF}$.

Next, we prove that $\text{WLF} \le \text{MCLC}$. Let $C$ be the set of physical fibers that
constitute a Min Cross Layer Cut, and let $a$ be an arbitrary node in the logical
network. Let $S_C \subset V_L$ be the set of nodes reachable from $a$ after $C$ has been removed
from the physical network. It follows that all logical links in $\delta(S_C)$ use fibers in $C$.

Let $w$ be the weight function on $E_L$ that achieves the optimal Weighted Load
Factor, and let $w(S_C)$ be the total weight of the logical links in $\delta(S_C)$. Also, let
$(i^*, j^*)$ be the physical fiber that carries the most weight for lightpaths in $\delta(S_C)$. The
definition of WLF implies that:

$$\text{WLF} = \min_{S \subset V_L, (i,j) \in E_P} \frac{\sum_{(s,t) \in \delta(S)} w_{st}}{\sum_{(s,t) \in \delta(S)} w_{st} f^{st}_{ij}} \le \frac{\sum_{(s,t) \in \delta(S_C)} w_{st}}{\sum_{(s,t) \in \delta(S_C)} w_{st} f^{st}_{i^*j^*}} \tag{2.17}$$

Next, since all logical links in $\delta(S_C)$ use fibers in $C$, we have:

$$\sum_{(s,t) \in \delta(S_C)} w_{st} \le \sum_{(i,j) \in C} \sum_{(s,t) \in \delta(S_C)} w_{st} f^{st}_{ij} \le |C| \sum_{(s,t) \in \delta(S_C)} w_{st} f^{st}_{i^*j^*} \tag{2.18}$$

Finally, combining inequalities (2.17) and (2.18), we have:

$$\text{WLF} \le \frac{w(S_C)}{\sum_{(s,t) \in \delta(S_C)} w_{st} f^{st}_{i^*j^*}} \le |C| = \text{MCLC}$$

□

Chapter 3


Assessing  Reliability  for  Layered 
Network  under  Random  Physical 
Failures 

3.1  Introduction 

The  study  of  cross-layer  survivability  in  Chapter  2  is  based  on  a  deterministic  failure 
model,  where  survivability  is  defined  by  one  (smallest)  set  of  physical  failures  that 
disconnect  the  logical  topology.  In  this  chapter,  we  extend  our  study  to  the  random 
physical  failure  model  where  all  physical  links  fail  independently  with  probability 
p.  This  probabilistic  failure  model  represents  a  snapshot  of  a  network  where  links 
fail  and  are  repaired  according  to  some  Markovian  process.  Hence,  p  represents  the 
steady-state  probability  that  a  physical  link  is  in  a  failed  state.  The  cross-layer 
reliability  of  the  network,  defined  to  be  the  probability  that  the  logical  topology 
stays  connected  under  the  random  physical  failures,  is  a  natural  generalization  of  the 
single-layer  all-terminal  reliability,  which  has  been  extensively  studied  in  the  literature 
(see  [32]  for  example).  However,  as  shown  in  the  previous  chapter,  the  structural 
properties  in  layered  networks  are  significantly  different  from  single-layer  networks. 
This makes many of the existing approaches either inapplicable or inefficient in the
multi-layer setting. In particular, in addition to the physical and logical topologies,
the underlying lightpath routing of a layered network determines the way the logical
network is affected by the physical failures, and therefore plays an important role in
the overall reliability of the network.

For example, in Figure 3-1, the logical topology consists of two parallel links between
nodes $s$ and $t$. Suppose every physical link fails independently with probability
$p$. The first lightpath routing in Figure 3-1 (c) routes the two logical links using
link-disjoint physical paths $(s, 1, 2, t)$ and $(s, 2, 3, t)$. Under this routing, the logical
network will be disconnected with probability $(1 - (1 - p)^3)^2$. On the other hand, the
second lightpath routing in Figure 3-1 (d), which routes the two logical links over the
same shortest physical route $(s, 2, t)$, has failure probability $2p - p^2$. While disjoint
path routing is generally considered more reliable, it is only true in this example for
small values of $p$. For large $p$ (e.g. $p > 0.5$), the second lightpath routing is actually
more reliable. Therefore, whether one lightpath routing is better than another may
depend on the value of $p$. In some cases, there may exist a lightpath routing with
lower failure probability over all values of $p$, as shown in Figure 3-1 (e).
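
The comparison between the two routings can be checked numerically. The following is a minimal sketch, assuming only the two closed-form failure probabilities quoted above; the function names and the bisection helper are illustrative and not part of the thesis.

```python
def fail_disjoint(p: float) -> float:
    """Failure probability when the two logical links use link-disjoint 3-hop paths."""
    path_fails = 1.0 - (1.0 - p) ** 3   # a 3-hop path fails if any of its hops fails
    return path_fails ** 2              # disconnected only if both paths fail


def fail_shared(p: float) -> float:
    """Failure probability when both logical links share the same 2-hop path."""
    return 1.0 - (1.0 - p) ** 2         # algebraically equal to 2p - p^2


def crossover(lo: float = 0.01, hi: float = 0.99, iters: int = 60) -> float:
    """Bisection for the value of p at which both routings are equally reliable."""
    def gap(p: float) -> float:
        return fail_disjoint(p) - fail_shared(p)

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gap(lo) * gap(mid) <= 0.0:   # sign change lies in [lo, mid]
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0
```

Under these formulas the disjoint routing has the lower failure probability for small p, and the shared routing overtakes it somewhat below p = 0.5, which is consistent with the statement that the shared route is more reliable for p > 0.5.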

Therefore, in order to design a reliable layered network, it is important to develop
a better understanding of the role of lightpath routings in cross-layer reliability.
To achieve this, we will extend the polynomial expression for single-layer network
reliability to the layered setting. In Section 3.3 we define the cross-layer failure
polynomial, which provides a formula for network reliability as a function of the link
failure probability. Hence, the cross-layer reliability can be estimated by approximating
the coefficients of the polynomial. Exploiting this relationship, in Sections
3.4-3.7 we develop Monte Carlo based estimation methods that approximate cross-layer
reliability with provable accuracy. Our method is not tailored to a particular
probability of link failure, and consequently, it does not require resampling in order
to estimate reliability under different values of link failure probability. That is, once
the polynomial is estimated, it can be used for any value of link failure probability
without resampling. Our approach is immediately applicable to single-layer networks
as well.

Figure 3-1: Example of disjoint, shortest and optimal routings (panels include (a) Physical
Topology and (b) Logical Topology): Non-disjoint routings can sometimes be more reliable
than disjoint routings. Optimally reliable routings over all values of p sometimes exist.


Another interesting property of the polynomial expression for reliability is that
its coefficients contain the structural information of the cross-layer topology, especially
lightpath routing. Consequently, it gives clear insights on how lightpath routing
should be designed for better reliability. This, together with our estimation algorithm,
enables us to revisit the network design problem from the viewpoint of network reliability.
In Section 3.8 we will investigate the connection between cross-layer reliability
and Min Cross Layer Cut, the survivability metric used in Chapter 2, and study the
performance of the lightpath routing algorithms presented in Chapter 2 under this
random failure model. We will briefly discuss several extensions to our failure model
in Section 3.9, and how the reliability estimation algorithms can be applied to these
new settings. The insights developed in this chapter, in particular the study of the
failure polynomial, lay the groundwork for our studies in the next two chapters,
which focus on designing networks to maximize reliability.

In Appendix 3.11.3, we briefly discuss an alternative approach based on importance
sampling [97] to assess reliability of layered networks, and contrast it with our failure
polynomial approach.


3.2  Previous  Work 

The network reliability estimation problem has been extensively studied in the single-layer
setting. Valiant [114] first showed that computing reliability in the single-layer
setting is #P-complete¹. Provan and Ball [87] later showed that it is #P-complete
even to approximate the reliability up to $\epsilon$ relative accuracy. Due to the inherent
complexity, most of the previous works in this context focused on approximating
the actual reliability. Although there are some works aimed at exact computation
of reliability through graph transformation and reduction [27,73,83,86,98,106,107,111],
the applications of such methods are highly limited since they are targeted to
particular topologies. Furthermore, those methods cannot be used for estimating
cross-layer reliability because they assume independence between link failures, while
failures are often correlated in multi-layer networks.

Monte Carlo simulation has also been used for estimating the single-layer reliability
for some fixed link failure probability. Using simulation, the reliability can be
approximated to an arbitrary accuracy, but the number of iterations required by direct
simulation tends to be very large when the failure probability is small. There are
various algorithms designed specifically to optimize for this case [41,42,62,63]. However,
each run of these algorithms only estimates the reliability for a given link failure
probability, and the algorithm must be repeated for a different failure probability.

1 The complexity class #P is the counting equivalent of NP. While a decision problem in NP asks
whether a feasible solution exists subject to certain constraints, its corresponding problem in
#P asks how many such feasible solutions exist.

A problem is #P-complete if and only if it is in #P, and every problem in #P can be reduced
to it in polynomial time. An algorithm that solves a #P-complete problem in polynomial time would
imply P = NP, and is therefore unlikely to exist.

Another approach is to use a polynomial expression for reliability [12] and estimate
every coefficient appearing in the polynomial; the reliability can then be
approximated using the estimated coefficients. The advantage of this approach over
simulation is that once every coefficient is estimated, they can be used for any value
of failure probability. Most of the works in this context have focused on bounding
the coefficients by applying subgraph counting techniques and results from combinatorics
[24,31,53,89,94]. This approach is computationally attractive, but its estimation
accuracy is not guaranteed. Some previous works studied the regime of low
failure probability by focusing on small cut sets [2,16]. In [82], a random sampling
technique is used to enhance those bounding results. In particular, [82] considers
another form of the polynomial used in [13], and estimates some of the coefficients
by enumerating spanning trees in the graph. These estimates are used to improve
the algebraic bound in [13]. This approach is relevant to our work in that it tries to
approximate the coefficients in the polynomial through random sampling. However,
the algorithm proposed in [82] is based on sampling spanning trees in the network,
which is not immediately applicable to our multi-layer setting because the properties
of cross-layer spanning trees are vastly different from their single-layer counterparts,
and sampling minimum spanning trees in layered networks becomes a much more
difficult problem, as discussed in Section 2.2.3.

In  this  chapter,  we  take  a  different  approach  from  [82]  by  sampling  cross-layer 
cuts.  Even  though  finding  minimum  cross-layer  cuts  is  an  NP-Hard  problem,  our 
cut-based  approach  is  feasible  in  the  cross-layer  setting  due  to  the  following  reasons: 

1. The size of the minimum cross-layer cut is bounded above by the minimum logical
node degree, which is usually a constant. In practice, it is often easier to find
or enumerate minimum cross-layer cuts than spanning trees, whose size is lower
bounded by the number of logical nodes, as shown in Theorem 2.4.


2.  Except  for  cross-layer  cuts  of  small  size,  it  can  be  shown  that  cross-layer  cuts  are 
abundant  in  a  layered  network  in  general.  This  makes  cut  sampling  a  promising 
approach. 

In  Section  3.4,  we  will  develop  a  reliability  estimation  algorithm  based  on  the 
above  insight.  Before  that,  we  first  formally  describe  our  model  and  provide  some 
mathematical  background. 


3.3  Model  and  Background 

A multi-layer network is modelled by a logical topology $G_L = (V_L, E_L)$ built on top
of the physical topology $G_P = (V_P, E_P)$ through a lightpath routing, where $V$ and
$E$ are the set of nodes and links respectively. The lightpath routing is denoted by
$f = [f_{ij}^{st}, (i, j) \in E_P, (s, t) \in E_L]$, where $f_{ij}^{st}$ takes the value 1 if logical link $(s, t)$ is
routed over physical link $(i, j)$, and 0 otherwise.

We consider a random failure model where the state of each physical link $(i, j) \in E_P$
is represented by the 0-1 random variable $x_{ij}$, which equals 0 if and only if the
physical link $(i, j)$ fails. Let $\mathcal{H} = 2^{E_P}$ be the family of all subsets of the physical links
$E_P$. We define a network state $S \in \mathcal{H}$ as the set of physical links that fail, that is,

$$S = \{(i, j) : x_{ij} = 0\}.$$

Each physical link fails independently with probability $p$. If a physical link $(i, j)$
fails, all the logical links $(s, t)$ carried over $(i, j)$ (i.e., $(s, t)$ such that $f_{ij}^{st} = 1$) also
fail. A network state $S$ is called a cross-layer cut if and only if the failure of the
physical links in $S$ causes the logical network to be disconnected. Let $R$ be a 0-1
random variable on $\mathcal{H}$ such that $R(S) = 1$ if and only if $S$ is not a cross-layer cut.
Then, the reliability of the layered network is defined to be $\Pr(R = 1)$. Similarly, the
unreliability is defined to be $\Pr(R = 0)$.
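
The definition of a cross-layer cut can be made concrete with a short sketch. The code below is illustrative only: it assumes the lightpath routing is given as a mapping from each logical link to the set of physical links it traverses, and it uses a union-find structure to test logical connectivity. The names `plink`, `is_cross_layer_cut`, `disjoint` and `shared` are hypothetical, and the two routings are those described for Figure 3-1.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

PhysLink = FrozenSet[str]          # undirected physical link {i, j}
LogLink = Tuple[str, str, int]     # (s, t, copy id); the id separates parallel links


def plink(i: str, j: str) -> PhysLink:
    return frozenset((i, j))


def is_cross_layer_cut(nodes: Iterable[str],
                       routing: Dict[LogLink, Set[PhysLink]],
                       failed: Set[PhysLink]) -> bool:
    """True if failing the physical links in `failed` disconnects the logical topology."""
    nodes = list(nodes)
    parent = {v: v for v in nodes}

    def find(v: str) -> str:
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for (s, t, _), path in routing.items():
        if path.isdisjoint(failed):         # a logical link survives only if no hop failed
            parent[find(s)] = find(t)
    return len({find(v) for v in nodes}) > 1


# Routings from the Figure 3-1 discussion (node names are illustrative).
disjoint = {
    ("s", "t", 0): {plink("s", "1"), plink("1", "2"), plink("2", "t")},
    ("s", "t", 1): {plink("s", "2"), plink("2", "3"), plink("3", "t")},
}
shared = {
    ("s", "t", 0): {plink("s", "2"), plink("2", "t")},
    ("s", "t", 1): {plink("s", "2"), plink("2", "t")},
}
```

With the disjoint routing a single physical failure never disconnects the logical topology, whereas with the shared routing the failure of either hop on the common route already forms a cross-layer cut.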

3.3.1 Cross-Layer Failure Polynomial

Since cross-layer reliability generalizes all-terminal reliability in single-layer networks,
the results by Valiant [114] and Provan et al. [87] immediately imply that approximating
cross-layer reliability within a constant factor is #P-complete. Hence, our
goal in this chapter is to develop a probabilistic algorithm that can accurately estimate
the reliability with high probability. As we discussed in Section 3.1, the relative
reliability performance among lightpath routings depends heavily on the value of $p$.
Therefore, when comparing lightpath routings, it is often necessary to assess the reliability
at different link failure probabilities in order to obtain better insight from the
comparison. For this purpose, it is useful to develop an estimation method such that
once an estimation is made, the result can be used for every value of $p$. Therefore, we
will develop an algorithm that outputs the reliability approximation as a polynomial
in $p$, so that comparing different lightpath routings at different link failure probabilities
is trivial. As we will see in Section 3.8, the failure polynomial also provides
important insights into the design of lightpath routings for better reliability.

The polynomial expression for reliability presented here is a natural extension of
the single-layer polynomial [12] to the cross-layer setting. Assume that there are $m$
physical links, i.e., $|E_P| = m$. The probability associated with a network state $S$
with exactly $i$ physical link failures (i.e., $|S| = i$) is $p^i (1 - p)^{m-i}$. Let $N_i$ be the
number of cross-layer cuts $S$ with $|S| = i$; then the probability that the network gets
disconnected is simply the sum of the probabilities over all cross-layer cuts, i.e.,

$$F(p) = \sum_{i=0}^{m} N_i \, p^i (1 - p)^{m-i}. \qquad (3.1)$$

Therefore, the failure probability of a multi-layer network can be expressed as a
polynomial in $p$. The function $F(p)$ will be called the cross-layer failure polynomial or
simply the failure polynomial. The vector $[N_0, \ldots, N_m]$ plays an important role in
assessing the reliability of a network. In particular, one can simply plug the value of
$p$ into the above failure polynomial to compute the reliability if the values of $N_i$ are
known.
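
As an illustration of Equation (3.1), the cut counts of a very small network can be obtained by exhaustive enumeration and plugged into the polynomial. This is a sketch under stated assumptions rather than the thesis's algorithm: physical links are indexed 0 to m-1, each logical link is given as (s, t, set of physical link indices), and the `DISJOINT` and `SHARED` instances encode the two routings of the Figure 3-1 discussion with a hypothetical link numbering (0: s-1, 1: 1-2, 2: 2-t, 3: s-2, 4: 2-3, 5: 3-t).

```python
from itertools import combinations
from typing import FrozenSet, List, Sequence, Tuple

LogicalLink = Tuple[int, int, FrozenSet[int]]   # (s, t, physical link indices used)


def disconnected(n_nodes: int, logical: Sequence[LogicalLink],
                 failed: FrozenSet[int]) -> bool:
    """True if the logical topology is disconnected once `failed` links are removed."""
    parent = list(range(n_nodes))

    def find(v: int) -> int:
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for s, t, path in logical:
        if not (path & failed):                 # logical link survives
            parent[find(s)] = find(t)
    return len({find(v) for v in range(n_nodes)}) > 1


def cut_counts(n_nodes: int, m: int, logical: Sequence[LogicalLink]) -> List[int]:
    """N_i for i = 0..m by brute force; only feasible for very small m."""
    return [sum(disconnected(n_nodes, logical, frozenset(S))
                for S in combinations(range(m), i))
            for i in range(m + 1)]


def failure_poly(N: Sequence[int], p: float) -> float:
    """Evaluate F(p) = sum_i N_i p^i (1-p)^(m-i)."""
    m = len(N) - 1
    return sum(N[i] * p ** i * (1.0 - p) ** (m - i) for i in range(m + 1))


# Logical nodes: 0 = s, 1 = t. Two parallel logical links in both cases.
DISJOINT = [(0, 1, frozenset({0, 1, 2})), (0, 1, frozenset({3, 4, 5}))]
SHARED = [(0, 1, frozenset({3, 2})), (0, 1, frozenset({3, 2}))]
```

Evaluating `failure_poly(cut_counts(2, 6, DISJOINT), p)` should agree with the closed form (1 - (1 - p)^3)^2, and the shared routing with 2p - p^2, which gives a quick sanity check on the counts.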

Intuitively, each $N_i$ represents the number of cross-layer cuts of size $i$ in the network.
Clearly, if $N_i > 0$, then $N_j > 0$ for all $j > i$ (because any cut of size $i$ will still
be a cut with the addition of more failed links). The smallest $i$ such that $N_i > 0$ is
of special importance because it represents the Min Cross Layer Cut (MCLC) of the
network, i.e., it is the minimum number of physical link failures needed to disconnect
the logical network. Although computing the MCLC is NP-Hard [70], for practical
purposes, the MCLC of a network is typically upper bounded by some constant,
such as the minimum node degree of the logical network. Therefore, for the rest of
the chapter, we denote the MCLC value of the network by $d$, and assume that it is
a constant independent of the physical network size. It is important to note that
$N_i = 0$ for all $i < d$, and the term $N_d \, p^d (1 - p)^{m-d}$ in the failure polynomial dominates for
small values of $p$. Consequently, if a lightpath routing tries to maximize MCLC, i.e.,
make $d$ as large as possible, it will achieve good reliability in the low failure probability
regime. On the other hand, its reliability performance is not guaranteed in other
regimes. This will be further discussed in Section 3.8, where we study the reliability
performance of the lightpath routing algorithms presented in Chapter 2. A similar
observation was made for single-layer networks in [20].

In  this  chapter,  we  focus  on  approximating  the  failure  polynomial.  We  will  use 
the  following  notions  of  approximation. 

Definition 3.1 (Relative Approximation) A function $\hat{F}(p)$ is an $\epsilon$-approximation
for the failure polynomial $F(p)$ if

$$|F(p) - \hat{F}(p)| \le \epsilon F(p), \quad \text{for all } p \in [0, 1].$$

This relative error is typically the measure of interest in the literature of reliability
estimation. However, as mentioned above, it is also #P-complete to approximate the
reliability to $\epsilon$ accuracy [87]. Hence, it is not likely that there exists a deterministic
$\epsilon$-approximation algorithm requiring reasonably low computation. For this reason,
our estimation focuses on the following probabilistic approximation.

Definition 3.2 (($\epsilon$, $\delta$)-approximation) A function $\hat{F}(p)$ is an $(\epsilon, \delta)$-approximation
for the failure polynomial $F(p)$ if

$$\Pr\left[\, |F(p) - \hat{F}(p)| \le \epsilon F(p) \,\right] \ge 1 - \delta, \quad \text{for all } p \in [0, 1].$$


In other words, an $(\epsilon, \delta)$-approximation algorithm approximates the polynomial to
$\epsilon$ relative accuracy with high probability. In Sections 3.4 and 3.5, we will present
randomized $(\epsilon, \delta)$-approximation algorithms for the failure polynomial.


3.3.2  Monte  Carlo  Simulation 

Our estimation algorithm is based on Monte Carlo simulation techniques. The central
theme of such Monte Carlo techniques is based on the Estimator Theorem, presented
below. Let $U$ be a ground set defined as the set of all possible events (e.g., all
network states), and $G$ be a subset of $U$ (e.g., cross-layer cuts). Suppose that we
want to estimate $|G|$. To do this, the Monte Carlo method samples an element $e$ from
$U$ uniformly at random for $T$ times. For each iteration $i$, let $X_i$ be the 0-1 random
variable that equals 1 if and only if the sampled element $e \in G$. Then the random
variable $Y = \frac{|U|}{T} \sum_{i=1}^{T} X_i$ is an unbiased estimator of $|G|$. The Estimator Theorem
states that:

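Theorem 3.1 (Estimator Theorem [77]) Let $\rho = \frac{|G|}{|U|}$. Then $Y = \frac{|U|}{T} \sum_{i=1}^{T} X_i$ is an
$(\epsilon, \delta)$-approximation to $|G|$, provided that

$$T \ge \frac{4}{\epsilon^2 \rho} \ln \frac{2}{\delta}.$$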


In other words, if we sample from the ground set $U$ frequently enough, we can
estimate $|G|$ accurately with high probability. According to Theorem 3.1, the ratio $\rho$,
called the density of the set $G$, is inversely proportional to the required sample size
$T$. This is because the squared coefficient of variation of $Y$, defined as $\frac{\mathrm{Var}[Y]}{E[Y]^2}$, equals
$\frac{1 - \rho}{T \rho}$. Therefore, a sample size $T$ in the order of $\frac{1}{\rho}$ is needed, so that the squared
coefficient of variation will not grow with $\frac{1}{\rho}$, which is necessary to keep the relative
error small [97].
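
The role of the density in the sample size can be sketched in a few lines. The code below assumes the usual form of the Estimator Theorem bound, T >= (4 / (eps^2 * rho)) * ln(2 / delta); the function names and the toy ground set are hypothetical and serve only as an illustration.

```python
import math
import random
from typing import Callable, Sequence


def sample_size(eps: float, delta: float, rho: float) -> int:
    """Smallest integer T with T >= 4 / (eps^2 * rho) * ln(2 / delta)."""
    return math.ceil(4.0 / (eps ** 2 * rho) * math.log(2.0 / delta))


def estimate_size(universe: Sequence[int], in_G: Callable[[int], bool],
                  T: int, rng: random.Random) -> float:
    """Monte Carlo estimate Y = |U| * (number of hits) / T of |G|."""
    hits = sum(in_G(rng.choice(universe)) for _ in range(T))
    return len(universe) * hits / T
```

For example, with a ground set of 1000 integers and G the multiples of 4 (density 0.25), `sample_size(0.1, 0.01, 0.25)` samples are enough for a 10% relative error with probability at least 0.99. Halving the density doubles the required number of samples.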

In the following sections, we will define the sets $G$ and $U$ in various ways to
ensure a high value of $\rho$, and propose polynomial-time Monte Carlo methods to compute
approximations of the failure polynomial.


3.4  Estimating  Cross-Layer  Reliability 

The most straightforward Monte-Carlo method to estimate network reliability is via
direct simulation, that is, collect $T$ samples from the universe of network states $\mathcal{H}$,
where each sample is obtained by simulating each physical link failure with probability
$p$. For each sample $S_i$, compute the value $R_i$ of the random variable $R(S_i)$. An
unbiased estimator for the reliability is then given by $\frac{1}{T} \sum_{i=1}^{T} R_i$. However, such an
approach has the following drawbacks:

1. The output of the algorithm is the reliability value for a particular link failure
probability $p$. To assess reliability at a different link failure probability $p'$, a new
round of sampling is required.

2. The unreliability of the network $F(p)$ can be arbitrarily small if the link failure
probability $p$ is sufficiently small. Therefore, the number of samples required to
keep relative error small, which is in the order of $\frac{1}{F(p)}$, can be arbitrarily large.

Our approach to approximating the cross-layer failure polynomial is to estimate
the values of $N_i$ in Equation (3.1) separately. If we can estimate each $N_i$ with sufficient
accuracy, we will obtain an approximate failure polynomial for the multi-layer
network. The idea is formalized in the following theorem.


Theorem 3.2 Let $\hat{N}_i$ be an $\epsilon$-approximation of $N_i$ for all $i \in \{1, \ldots, m\}$, then the
function $\hat{F}(p) = \sum_{i=0}^{m} \hat{N}_i \, p^i (1 - p)^{m-i}$ is an $\epsilon$-approximation for the failure
polynomial.

Proof. For all $0 \le p \le 1$,

$$|F(p) - \hat{F}(p)| \le \sum_{i=0}^{m} |N_i - \hat{N}_i| \, p^i (1 - p)^{m-i} \le \sum_{i=0}^{m} \epsilon N_i \, p^i (1 - p)^{m-i} = \epsilon F(p).$$

□

Corollary 3.3 Let $A$ be an algorithm that computes an $(\epsilon, \frac{\delta}{m+1})$-approximation for
each $N_i$. Then $A$ gives an $(\epsilon, \delta)$-approximation algorithm for the failure polynomial.

Proof. By the union bound, the probability that all the $\hat{N}_i$ estimates are $\epsilon$-approximate
is at least $1 - \sum_{i=0}^{m} \frac{\delta}{m+1} = 1 - \delta$. By Theorem 3.2, $A$ gives an $(\epsilon, \delta)$-approximation
algorithm for the failure polynomial. □

Note that this approach can be considered as a form of stratified sampling [97],
where the sample space $\mathcal{H}$ is partitioned into multiple subgroups $\mathcal{H}_i$, and the
conditional expectations $E[R \mid \mathcal{H}_i]$ are estimated independently. The expectation of the
random variable $R$ is thus given by:

$$E[R] = \sum_{\mathcal{H}_i} E[R \mid \mathcal{H}_i] \Pr(\mathcal{H}_i).$$

For the cross-layer reliability estimation problem, we define each subgroup $\mathcal{H}_i$ to
be all possible subsets of $E_P$ with size $i$, that is, $\mathcal{H}_i = \{S \subseteq E_P : |S| = i\}$. It follows
that $\Pr(\mathcal{H}_i) = \binom{m}{i} p^i (1 - p)^{m-i}$, and the conditional probability, $\Pr(R(S) = 0 \mid S \in \mathcal{H}_i)$,
is simply $N_i / \binom{m}{i}$. The key observation is that the conditional probability and variance
are independent of the link failure probability $p$. As a result, the confidence interval
obtained by simulating the conditional events within a subgroup is independent of $p$.
This ensures the effectiveness of the algorithm even if $p$ is small.

As a result of Corollary 3.3, it suffices to obtain an $(\epsilon, \frac{\delta}{m+1})$-approximation for each
$N_i$. In the remainder of this section, we will discuss how this can be achieved.

3.4.1 Estimating $N_i$


Let $\mathcal{H}_i$ be the family of all subsets of $E_P$ with exactly $i$ physical links. Clearly, $N_i$
is the number of subsets in $\mathcal{H}_i$ that are cross-layer cuts. Hence, one can compute
the exact value of $N_i$ by enumerating all subsets in $\mathcal{H}_i$ and counting the number of
cross-layer cuts. However, the number of subsets to enumerate is $\binom{m}{i}$, which can be
prohibitively large.

An alternative approach to estimating $N_i$ is to carry out Monte Carlo simulation
on $\mathcal{H}_i$. Suppose we sample uniformly at random from $\mathcal{H}_i$ for $T$ times, and count the
number of cross-layer cuts $W$ in the sample. The Estimator Theorem guarantees that
$\binom{m}{i} \frac{W}{T}$ is an $(\epsilon, \frac{\delta}{m+1})$-approximation, provided that:

$$T \ge \frac{4}{\epsilon^2 \rho_i} \ln \frac{2(m + 1)}{\delta}, \qquad (3.2)$$

where $\rho_i = N_i / \binom{m}{i}$ is the density of cross-layer cuts in $\mathcal{H}_i$. The main issue here is
that the exact value for $\rho_i$, which depends on $N_i$, is unknown to us. However, if we
substitute $\rho_i$ in Equation (3.2) with a lower bound of $\rho_i$, the number of iterations will
be guaranteed to be no less than the required value. Therefore, it is important to
establish a good lower bound for $\rho_i$ in order to keep the number of iterations small
while achieving the desired accuracy.
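
The sampling step for a single coefficient can be sketched as follows. This is a minimal illustration, assuming the sample-size bound has the form T >= (4 / (eps^2 * rho)) * ln(2(m+1) / delta) and that a cut oracle `is_cut` is available; `toy_is_cut` is a hypothetical oracle for two logical links routed over physical links {0, 1, 2} and {3, 4, 5}, and none of the names come from the thesis.

```python
import math
import random
from typing import Callable, FrozenSet


def required_samples(eps: float, delta: float, m: int, rho_lb: float) -> int:
    """Sample size with a lower bound rho_lb used in place of the unknown density."""
    return math.ceil(4.0 / (eps ** 2 * rho_lb) * math.log(2.0 * (m + 1) / delta))


def estimate_N_i(m: int, i: int, is_cut: Callable[[FrozenSet[int]], bool],
                 T: int, rng: random.Random) -> float:
    """Estimate N_i as C(m, i) * W / T from T uniform samples of i-subsets."""
    W = sum(is_cut(frozenset(rng.sample(range(m), i))) for _ in range(T))
    return math.comb(m, i) * W / T


def toy_is_cut(S: FrozenSet[int]) -> bool:
    """Hypothetical oracle: a cut must hit both routes {0,1,2} and {3,4,5}."""
    return bool(S & {0, 1, 2}) and bool(S & {3, 4, 5})
```

For this toy network with m = 6, the exact value of N_3 is 18, so `estimate_N_i(6, 3, toy_is_cut, T, rng)` should land close to 18 for a few thousand samples. A looser (smaller) lower bound passed to `required_samples` only increases T, which is why a tight bound matters.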


3.4.2 Lower Bounding $\rho_i$

Given a layered network, suppose its Min Cross Layer Cut value $d$ is known. Theorem 3.4
gives a lower bound on $\rho_i$.


Theorem 3.4 For $i \ge d$, $\rho_i \ge \binom{m-d}{i-d} / \binom{m}{i}$.


Proof. Since $d$ is the Min Cross Layer Cut value, there exists a cross-layer cut $S$ with
size $d$. Any superset of $S$ with $i$ physical links is therefore also a cross-layer cut.
Since there are a total of $\binom{m-d}{i-d}$ such supersets, we have $N_i \ge \binom{m-d}{i-d}$, and the theorem
follows immediately. □

Therefore, we can use $\tilde{\rho}_i = \binom{m-d}{i-d} / \binom{m}{i}$ as the lower bound for $\rho_i$ in (3.2) to estimate
$N_i$, with the following observations:

1. The MCLC value $d$ needs to be known in advance.

2. The number of iterations can be very large for small values of $i$. For example,
when $i = d$, the number of iterations $T$ required is $\frac{4 \binom{m}{d}}{\epsilon^2} \ln \frac{2(m+1)}{\delta}$, which is no
better than enumerating all sets in $\mathcal{H}_d$ by brute force.

3. The lower bound $\tilde{\rho}_i$ increases with $i$. In particular, $\frac{\tilde{\rho}_{i+1}}{\tilde{\rho}_i} = 1 + \frac{d}{i + 1 - d}$. Therefore,
the number of iterations required to estimate $N_i$ decreases with $i$.
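
The behaviour of the lower bound from Theorem 3.4 can be verified with exact rational arithmetic. The sketch below assumes the bound is C(m-d, i-d) / C(m, i), as implied by the proof of the theorem; `rho_lower` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb


def rho_lower(m: int, d: int, i: int) -> Fraction:
    """Lower bound on the density of cross-layer cuts among i-subsets, for i >= d."""
    return Fraction(comb(m - d, i - d), comb(m, i))
```

For m = 30 and d = 4, the bound equals 1 / C(30, 4) at i = d and rises to 1 at i = m, and consecutive values satisfy the ratio 1 + d / (i + 1 - d), so the Monte Carlo sample size shrinks as i grows.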

In the next subsection, we will present an algorithm that combines the enumeration
and Monte Carlo methods to take advantage of their different strengths. In Section 3.5,
we will present enhanced versions of the algorithm which significantly reduce
the number of iterations by establishing a much tighter lower bound on $\rho_i$. The final
outcome is an $(\epsilon, \delta)$-approximation algorithm for the failure polynomial $F(p)$ that
requires only a polynomial number of iterations.

3.4.3  A  Combined  Enumeration  and  Monte  Carlo  Approach 

Recall that $N_i$ can be estimated with two different approaches, brute-force enumeration
and Monte Carlo. The two approaches can be combined to design an efficient
$(\epsilon, \delta)$-approximation algorithm for the failure polynomial.

The key observation for the combined approach is that brute-force enumeration
works well when $i$ is small, and the Monte Carlo method works well when $i$ is large.
Therefore, it makes sense to use the enumeration method to find the Min Cross Layer
Cut value $d$, as well as the associated value $N_d$. Once we obtain the value of $d$, we
can decide on the fly whether to use the enumeration method or the Monte Carlo
method to estimate each $N_i$, by comparing the number of iterations required by each
method.
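
The on-the-fly choice between the two methods can be sketched as a simple comparison of iteration counts. The code assumes the Monte Carlo sample size (4 / (eps^2 * rho)) * ln(2(m+1) / delta) with the density replaced by the lower bound C(m-d, i-d) / C(m, i); the function names and the returned labels are illustrative.

```python
import math
from typing import Dict, Tuple


def mc_iterations(m: int, d: int, i: int, eps: float, delta: float) -> int:
    """Monte Carlo sample size for N_i using the density lower bound for i >= d."""
    rho_lb = math.comb(m - d, i - d) / math.comb(m, i)
    return math.ceil(4.0 / (eps ** 2 * rho_lb) * math.log(2.0 * (m + 1) / delta))


def plan(m: int, d: int, eps: float, delta: float) -> Dict[int, Tuple[str, int]]:
    """For each i >= d, pick whichever method needs fewer iterations."""
    choice = {}
    for i in range(d, m + 1):
        enum = math.comb(m, i)                     # subsets to enumerate
        mc = mc_iterations(m, d, i, eps, delta)    # samples to draw
        choice[i] = ("enumerate", enum) if enum <= mc else ("monte_carlo", mc)
    return choice
```

For a network with 30 physical links and d = 4, enumeration is cheaper at both ends of the range (few subsets exist when i is close to d or to m), while Monte Carlo sampling is cheaper in the middle, where the number of subsets is largest. The value of delta used in any concrete run is an input chosen by the user.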

3.4.4 Time Complexity Analysis


The  total  number  of  iterations  of  this  combined  approach  will  be: 


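$$\sum_{i} \min\left\{ \binom{m}{i},\; \frac{4 \binom{m}{i}}{\epsilon^2 \binom{m-d}{i-d}} \ln \frac{2(m + 1)}{\delta} \right\},$$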


where  the  terms  inside  the  min  operator  are  the  number  of  iterations  required  by 
enumeration  and  Monte  Carlo  methods  respectively.  The  total  number  of  iterations 
can  be  upper  bounded  as  follows: 

where the first inequality is implied by the following lemma:


Lemma 3.5 $\sum_{i=0}^{k} \binom{m}{i} \le (m + 1)^{k}$.




Proof. 


Therefore, the algorithm only needs a polynomial number of iterations overall.
The improvement in running time of this combined approach is illustrated by Figure 3-2.


Figure 3-2: Monte-Carlo vs Enumeration: Number of iterations for estimating $N_i$ for a network
with  30  physical  links,  e  =  0.01 , 6  =  d  =  4.  The  shaded  region  represents  the  required  iterations 
for  the  combined  approach. 

3.5 Improved $\rho_i$ Lower Bounds for Reliability Estimation

The running time performance of the algorithm introduced in the previous section
hinges on the tightness of the lower bounds on $\rho_i$ used for the algorithm. In this section,
we discuss ways to tighten the lower bounds.

The idea behind these improved bounds is based on the observation that any
superset of a cross-layer cut is also a cross-layer cut. Let $\mathcal{F} = \{C_1, \ldots, C_n\}$ be a
collection of cross-layer cuts. For each $C_j \in \mathcal{F}$, let $\partial_i(C_j) \subseteq \mathcal{H}_i$ be the family of
supersets of $C_j$ with $i$ physical links. Similarly, let $\partial_i(\mathcal{F}) = \bigcup_{C_j \in \mathcal{F}} \partial_i(C_j)$ be the union
over all $\partial_i(C_j)$. Using the terminology in [22], the family of subsets $\partial_i(\mathcal{F})$ is called
the $i$th upper shadow for $\mathcal{F}$. The following theorem provides a lower bound on $\rho_i$ in
terms of $\partial_i(\mathcal{F})$.


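Theorem 3.6 Let $\mathcal{F}$ be a collection of cross-layer cuts with size less than $i$, then

$$\rho_i \ge \frac{|\partial_i(\mathcal{F})|}{\binom{m}{i}}.$$


Proof. Every set $S \in \partial_i(\mathcal{F})$ is a superset of some cross-layer cut in $\mathcal{F}$, and is
therefore a cross-layer cut with size $i$. Therefore, $\partial_i(\mathcal{F})$ is a collection of cross-layer
cuts with size $i$, which implies $|\partial_i(\mathcal{F})| \le N_i$. It follows that $\frac{|\partial_i(\mathcal{F})|}{\binom{m}{i}} \le \frac{N_i}{\binom{m}{i}} = \rho_i$. □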

Therefore, if we know the value of $|\partial_i(\mathcal{F})|$, we can use $\frac{|\partial_i(\mathcal{F})|}{\binom{m}{i}}$ as the lower bound
for $\rho_i$ in the Monte Carlo method to estimate $N_i$. Note that if $\mathcal{F}$ contains only
a Min Cross Layer Cut of the network, the value of $\frac{|\partial_i(\mathcal{F})|}{\binom{m}{i}}$ is equal to the bound
given by Theorem 3.4. Therefore, Theorem 3.6 generalizes the lower bound result
in Section 3.4.2.
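
For small instances the upper shadow can be built explicitly, which makes the relation between Theorem 3.6 and Theorem 3.4 easy to check. The following sketch enumerates supersets by brute force; it assumes physical links are indexed 0 to m-1, and the helper names are hypothetical.

```python
from itertools import combinations
from math import comb
from typing import FrozenSet, Iterable, Set


def upper_shadow(cuts: Iterable[Iterable[int]], m: int, i: int) -> Set[FrozenSet[int]]:
    """All i-subsets of {0..m-1} that contain at least one of the given cuts."""
    shadow: Set[FrozenSet[int]] = set()
    for C in map(frozenset, cuts):
        if len(C) > i:
            continue                                  # no superset of size i exists
        rest = [e for e in range(m) if e not in C]
        for extra in combinations(rest, i - len(C)):
            shadow.add(C | frozenset(extra))          # duplicates collapse in the set
    return shadow


def rho_lower_bound(cuts: Iterable[Iterable[int]], m: int, i: int) -> float:
    """Lower bound on the density of cross-layer cuts of size i (Theorem 3.6)."""
    return len(upper_shadow(cuts, m, i)) / comb(m, i)
```

With a single cut of size d the shadow has C(m-d, i-d) elements, matching Theorem 3.4. Adding more known cuts can only enlarge the shadow and therefore tighten the bound.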

Although the value of each $|\partial_i(C_j)| = \binom{m - |C_j|}{i - |C_j|}$ can be computed easily, finding
the size of the union $\partial_i(\mathcal{F}) = \bigcup_{C_j \in \mathcal{F}} \partial_i(C_j)$ can be difficult because the sets $\partial_i(C_j)$
are not disjoint. Instead of computing $|\partial_i(\mathcal{F})|$ precisely, we introduce techniques for
lower-bounding $|\partial_i(\mathcal{F})|$. The first technique, introduced in Section 3.5.1, is based
on importance sampling for the Union of Sets problem [64]. The second technique,
introduced in Section 3.5.2, is to bound the size of $\partial_i(\mathcal{F})$ with a recursive formula,
based on the Kruskal-Katona Theorem [22].

3.5.1  Lower  Bound  by  Approximating  Union  of  Sets 

Given a set of cross-layer cuts $\mathcal{F}$, the problem of estimating the size of its upper
shadow $\partial_i(\mathcal{F})$ can be formulated as the Union of Sets Problem [64], for which a
Monte-Carlo based approach exists using the technique of importance sampling. We
summarize the result in this section and leave the detailed proofs to Appendix 3.11.1.

Theorem 3.7 Let $\mathcal{F} = \{C_1, \ldots, C_n\}$ be a collection of cross-layer cuts of the layered
network. For each $C_j \in \mathcal{F}$, let $\partial_i(C_j)$ be the $i$th upper shadow of $C_j$. There exists a
Monte Carlo method that produces an $(\epsilon_{lb}, \delta_{lb})$-approximation, $\hat{L}_i$, for $L_i = |\partial_i(\mathcal{F})|$,
provided that the number of samples is at least:

$$\frac{4 |\mathcal{F}|}{\epsilon_{lb}^2} \ln \frac{2}{\delta_{lb}}. \qquad (3.3)$$

Proof. Let $U = \{(S, j) : j \in \{1, \ldots, |\mathcal{F}|\}, S \in \partial_i(C_j)\}$ be the ground set for the Monte
Carlo algorithm, and let $G = \{(S, j) : S \in \partial_i(\mathcal{F}), j = \min\{k : S \in \partial_i(C_k)\}\}$ be the
events of interest. We show in Appendix 3.11.1 that the ground set $U$ can be sampled
uniformly at random. Since $|G| = |\partial_i(\mathcal{F})|$ and $\frac{|G|}{|U|} \ge \frac{1}{|\mathcal{F}|}$, Theorem 3.7 follows
immediately from Theorem 3.1. □
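
The ground set and event set used in this proof can be sampled with a standard union-of-sets scheme. The sketch below is an assumption about how such a sampler might look, not a transcription of Appendix 3.11.1: it picks an index j with probability proportional to the number of i-supersets of C_j, completes C_j to a uniformly random i-superset S, and counts a hit when j is the smallest index whose cut is contained in S. All names are hypothetical.

```python
import math
import random
from typing import Iterable


def estimate_shadow_size(cuts: Iterable[Iterable[int]], m: int, i: int,
                         T: int, rng: random.Random) -> float:
    """Monte Carlo estimate of the size of the i-th upper shadow of `cuts`."""
    cuts = [frozenset(C) for C in cuts if len(set(C)) <= i]
    if not cuts:
        return 0.0
    # weights[j] = number of i-supersets of cuts[j]; their sum is |U|.
    weights = [math.comb(m - len(C), i - len(C)) for C in cuts]
    U = sum(weights)
    hits = 0
    for _ in range(T):
        j = rng.choices(range(len(cuts)), weights=weights)[0]
        rest = [e for e in range(m) if e not in cuts[j]]
        S = cuts[j] | frozenset(rng.sample(rest, i - len(cuts[j])))
        # (S, j) belongs to G only if j is the first cut contained in S.
        first = next(k for k, C in enumerate(cuts) if C <= S)
        hits += (first == j)
    return U * hits / T
```

Because each set in the shadow is counted exactly once in G, the hit probability is the shadow size divided by |U|, which is at least 1 / |F|. This is why the sample size in (3.3) depends on the number of cuts in the collection rather than on the total number of i-subsets.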


Theorem 3.7 implies $\tilde{\rho}_i = \frac{\hat{L}_i}{(1 + \epsilon_{lb}) \binom{m}{i}}$ is a lower bound on $\rho_i$ with probability at
least $1 - \delta_{lb}$. The following theorem describes how such a probabilistic lower bound
can be used to estimate $N_i$.

Theorem 3.8 Let $\hat{L}_i$ be an $(\epsilon_{lb}, \delta_{lb})$-approximation for $|\partial_i(\mathcal{F})|$. Then, the Monte
Carlo method described in Section 3.4.1 yields an $(\epsilon_{mc}, \delta_{lb} + \delta_{mc})$-approximation for
$N_i$, provided that the number of samples is at least:

$$T_{mc} = \frac{4 (1 + \epsilon_{lb}) \binom{m}{i}}{\epsilon_{mc}^2 \hat{L}_i} \ln \frac{2}{\delta_{mc}}. \qquad (3.4)$$

Proof. By definition of $\hat{L}_i$, the probability that $\tilde{\rho}_i = \frac{\hat{L}_i}{(1 + \epsilon_{lb}) \binom{m}{i}}$ is not a lower bound
on $\rho_i$ is at most $\delta_{lb}$. Given that $\tilde{\rho}_i$ is a lower bound for $\rho_i$, by the Estimator Theorem,
the probability that $\hat{N}_i$ is not an $\epsilon_{mc}$-approximation for $N_i$ is at most $\delta_{mc}$. Hence, by
the union bound, the probability that none of these "bad" events happen is at least
$1 - (\delta_{lb} + \delta_{mc})$, and the theorem follows. □


To apply this result to reliability estimation, we can modify our algorithm presented
in Section 3.4.3 to also maintain the collection $\mathcal{F}$ of cross-layer cuts as we carry
out the enumeration or Monte Carlo methods. Specifically, as we discover a cross-layer
cut $C_j$ with size $i$ when estimating $N_i$, we will add the cut $C_j$ to our collection
$\mathcal{F}$. When we move on to estimate $N_{i+1}$, we will have a collection $\mathcal{F}$ of cross-layer
cuts with size $i$ or smaller. We can therefore apply Theorem 3.6 to obtain a lower
bound for $N_{i+1}$. Note that the size of $\partial_i(\mathcal{F})$ is monotonic in $\mathcal{F}$. Therefore, the more
cross-layer cuts that are included in $\mathcal{F}$, the better the lower bound is.


3.5.2  Lower  Bound  based  on  Kruskal-Katona  Theorem 

We can also derive a lower bound on $p_i$ based on the values of $N_j$ for $j < i$, using the Kruskal-Katona theorem. Let $[m] = \{1, \ldots, m\}$, i.e., $[m]$ is the enumeration of physical links. Let $\mathcal{H}_i^m = \{S \subseteq [m] : |S| = i\}$ be the family of subsets of $[m]$ with size $i$. For any $\mathcal{F} \subseteq \mathcal{H}_j^m$ with $j < i$, we denote $\partial_i^m(\mathcal{F})$ to be the $i$th upper shadow over $[m]$ for $\mathcal{F}$.

We define the lexicographic ordering on $\mathcal{H}_i^m$ as follows: Given any two subsets $S_1$ and $S_2$ in $\mathcal{H}_i^m$, $S_1$ is lexicographically smaller than $S_2$ if and only if $\min\{e : e \in S_1 \Delta S_2\} \in S_1$, where $\Delta$ denotes the symmetric difference between the two sets, i.e., $S_1 \Delta S_2 = S_1 \cup S_2 - S_1 \cap S_2$. For example, the set $\{1, 2, 4\}$ is lexicographically smaller than $\{1, 3, 4\}$ because the smallest element where the two sets differ, 2, is in the first set.
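
The ordering can be stated in a few lines of Python. This is only a sketch of the definition above; the function name `lex_smaller` is an illustrative choice.

```python
def lex_smaller(s1, s2):
    """True if s1 precedes s2: the smallest element of the symmetric difference is in s1."""
    diff = set(s1) ^ set(s2)          # symmetric difference S1 Δ S2
    return bool(diff) and min(diff) in set(s1)

print(lex_smaller({1, 2, 4}, {1, 3, 4}))   # the example from the text
```

For equal-size subsets this coincides with the order in which `itertools.combinations` enumerates subsets of a sorted ground set, which is convenient for small experiments.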

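Given $\mathcal{H}_i^m$, the family of all subsets with size $i$, let $\mathcal{H}_i^m(k) \subseteq \mathcal{H}_i^m$ be the first $k$ elements of $\mathcal{H}_i^m$ under the lexicographical ordering. The Kruskal-Katona theorem states that $\mathcal{H}_i^m(k)$ yields the smallest upper shadow among all $k$-subsets of $\mathcal{H}_i^m$:

Theorem 3.9 ([22]) For any $i < j$ and $\mathcal{F} \subseteq \mathcal{H}_i^m$,

$$|\partial_j^m(\mathcal{F})| \ge |\partial_j^m(\mathcal{H}_i^m(|\mathcal{F}|))|. \tag{3.5}$$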


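In other words, for a fixed value of $k$, the upper shadow for $\mathcal{F}$ with $|\mathcal{F}| = k$ is minimized if $\mathcal{F}$ consists of the first $k$ subsets of $\mathcal{H}_i^m$ in lexicographical order. Therefore, if a multi-layer network has $N_i$ cross-layer cuts with size $i$, Theorem 3.9 implies that $N_j \ge |\partial_j^m(\mathcal{H}_i^m(N_i))|$ for all $j > i$. We prove the following recursive formula for $|\partial_j^m(\mathcal{H}_i^m(N_i))|$:

Theorem 3.10 For $i \le j \le m$ and $1 \le k \le \binom{m}{i}$, let $w = \max\{0 \le r \le i : \binom{m-r}{i-r} \ge k\}$. Also, let $t = m - (w+1)$, $u = j - (w+1)$ and $v = i - (w+1)$. Then:

$$|\partial_j^m(\mathcal{H}_i^m(k))| = \begin{cases} \binom{m-i}{j-i}, & \text{if } k = 1 \\ \binom{t}{u} + |\partial_{u+1}^{t}(\mathcal{H}_{v+1}^{t}(k - \binom{t}{v}))|, & \text{otherwise.} \end{cases}$$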


Proof.  See  Appendix  3.11.2.  □ 
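
The recursion of Theorem 3.10 translates directly into code. The sketch below assumes the reading of the theorem in which $w$ is the largest $r$ with $\binom{m-r}{i-r} \ge k$; the function names are illustrative, and the brute-force routine is included only as a cross-check on small instances.

```python
from itertools import combinations
from math import comb

def min_upper_shadow(m, i, j, k):
    """Size of the j-th upper shadow of the first k size-i subsets of [m] (0 if k == 0)."""
    if k == 0:
        return 0
    if k == 1:
        return comb(m - i, j - i)      # supersets of a single size-i set
    # Largest w such that at least k size-i subsets of [m] contain [w].
    w = max(r for r in range(i + 1) if comb(m - r, i - r) >= k)
    t, u, v = m - (w + 1), j - (w + 1), i - (w + 1)
    return comb(t, u) + min_upper_shadow(t, v + 1, u + 1, k - comb(t, v))

def brute_force_shadow(m, i, j, k):
    """Direct count, enumerating subsets in lexicographic order."""
    first_k = [set(c) for c in list(combinations(range(1, m + 1), i))[:k]]
    return sum(1 for T in combinations(range(1, m + 1), j)
               if any(S <= set(T) for S in first_k))

print(min_upper_shadow(5, 2, 3, 5), brute_force_shadow(5, 2, 3, 5))
```

Each recursive call reduces the ground set by at least one element, so the recursion depth is at most $m$.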

When estimating $N_j$ in the $j$th round of the algorithm presented in Section 3.4.3, the algorithm has already discovered a collection of cross-layer cuts with size $i$ for each $i < j$, either by sampling or exhaustive enumeration. Let $N_i$ be the number of cross-layer cuts with size $i$ seen by the algorithm. Then $N_j$ is lower bounded by $\max_{1 \le i < j} |\partial_j^m(\mathcal{H}_i^m(N_i))|$, where each term $|\partial_j^m(\mathcal{H}_i^m(N_i))|$ can be computed easily using the recursive formula in Theorem 3.10. Notice that the original lower bound in Theorem 3.4 is a special case where a single MCLC is assumed and (according to Theorem 3.10) $N_j$ is lower bounded by $|\partial_j^m(\mathcal{H}_d^m(N_d = 1))| = \binom{m-d}{j-d}$ for each $j \ge d$. Theorem 3.10 improves this bound by accounting for more cross-layer cuts, and therefore, it can be used to further reduce the number of iterations required by the Monte Carlo algorithm. We note however that the enhanced lower bounds obtained by Theorems 3.7 and 3.10 may still result in the same order of $O(m^d \log m)$ iterations. Nevertheless, simulation studies in Section 3.6 show that these enhanced bounds can substantially reduce the number of iterations.

Finally, a probabilistic lower bound for $N_j$ can also be established by using the estimated value $\hat{N}_i$ instead of $N_i$. In that case, the parameters $\delta$ and $\epsilon$ need to be adjusted in a way similar to Theorem 3.8.

3.6  Empirical  Studies 

We present some empirical results about the reliability estimation algorithms. We compare the different lower bounds for $N_j$ produced by the methods described in Sections 3.4 and 3.5, and look at the number of iterations required for different variants of the estimation algorithm. In addition, we will compare the actual accuracy of the failure polynomials computed by the algorithm with the theoretical guarantee provided by the Estimator Theorem.


Figure  3-3:  The  augmented  NSFNET. 


We  used  the  augmented  NSFNET  (Figure  3-3)  as  the  physical  topology.  We 
generated  350  random  logical  topologies  with  6  to  12  nodes  and  created  lightpath 
routings  using  the  MCF  (Multi-Commodity  Flow)  algorithms  described  in  Chapter  2. 
For  each  lightpath  routing,  we  ran  four  different  reliability  estimation  algorithms  to 
compute  their  failure  polynomials: 

1. ENUM: Each value of $N_i$ is computed by enumeration.

2. MIXED_original: The original algorithm that combines the enumeration and Monte Carlo methods, introduced in Section 3.4.3, with $\epsilon = \delta = 0.01$.

3. MIXED_KK: The algorithm that combines the enumeration and Monte Carlo methods, using Theorem 3.10 to derive the lower bound for $p_i$.

4. MIXED_sample: The algorithm that combines the enumeration and Monte Carlo methods, using the importance sampling technique in Section 3.5 to derive the
lower  bound  for  p,.  In  this  case,  we  have  picked  emc  =  0.01,  =  0.1,  5mc  = 

j2-  For  the  collection  C  of  cross-layer  cuts,  we  only  keep  the  100  smallest 
cross-layer  cuts. 

Table 3.1 shows the average number of iterations required for each algorithm to compute the failure polynomial. The result shows that the combined enumeration and Monte Carlo approach helps to significantly reduce the number of iterations. In addition, the algorithms MIXED_KK and MIXED_sample are able to further reduce the number of iterations by exploiting the knowledge of the discovered cross-layer cuts.

Between the two enhanced algorithms, algorithm MIXED_sample in general achieves a better lower bound, as shown in Figure 3-4, because of the additional importance sampling step. However, for small regimes of $i$ where the number of iterations dominates, the lower bounds from the two algorithms are close enough that the difference in the number of iterations is small. In addition, since algorithm MIXED_sample requires the additional importance sampling step, the overall numbers of iterations required by the two algorithms are close to each other.


                      Monte Carlo Iterations
Algorithm         N_i Estimation   p_i Estimation   Total
ENUM              536,870,912      N/A              536,870,912
MIXED_original    46,900,857       N/A              46,900,857
MIXED_KK          15,467,815       N/A              15,467,815
MIXED_sample      11,968,535       2,485,477        14,454,012

Table 3.1: Number of iterations for each algorithm.


Figure 3-4: Lower bounds for $N_j$ produced by MIXED_original and MIXED_enhanced.

Figure 3-5: Number of iterations to estimate $N_i$ by each algorithm.

Finally, we compare the actual accuracy of the failure polynomial generated by algorithm MIXED_sample with the theoretical guarantee given by the Estimator Theorem. Figure 3-6 shows the accuracy results on two sets of failure polynomials, with Monte Carlo parameters $\epsilon = 0.01$ and $0.05$. For each set of failure polynomials, we compute the maximum relative error among them for various values of $p$. Therefore, each curve shows the upper envelope of relative errors by the failure polynomials. In both cases, the relative error is much smaller than the theoretical guarantee. This is because by using a lower bound for $p_i$, the algorithm over-samples in each Monte Carlo approximation for $N_i$. In addition, the errors for the $N_i$ estimates are independent and may cancel each other out. Therefore, in practice, the algorithm would provide much better estimates than theoretically guaranteed.




Figure  3-6:  Relative  error  of  the  failure  polynomial  approximation. 


3.7 Estimating Cross-Layer Reliability with Absolute Error

We have considered computing a relative approximation for the failure polynomial $F(p)$. However, in certain contexts, it may make sense to describe the error in absolute terms. A function $\hat{F}(p)$ is $\epsilon$-absolute-approximate to $F(p)$ if:

$$|\hat{F}(p) - F(p)| \le \epsilon.$$

For example, if our goal is to design a network with a certain reliability target (say five 9s), it is sufficient to present a network whose associated failure polynomial has absolute error in the order of $10^{-6}$. Constructing a failure polynomial with such relative error, however, may be overly stringent.

A function that is $\epsilon$-approximate to $F(p)$ is immediately $\epsilon$-absolute-approximate, since $F(p) \le 1$. As it turns out, using a similar approach of probabilistically estimating each $N_i$ requires a much smaller number of samples to achieve $\epsilon$-absolute accuracy. The total number of iterations required to compute an $\epsilon$-absolute-approximation for $F(p)$ with high probability is $O(m \log m)$, in contrast to $O(m^d \log m)$ in the case of $\epsilon$-approximation.

The intuition behind the difference is that computing an $\epsilon$-approximation for $N_i$ is difficult when the density $p_i$ is small. However, in that case, the absolute contribution of the term $N_i p^i (1-p)^{m-i} = p_i \binom{m}{i} p^i (1-p)^{m-i}$ will be small as well. Therefore, in this case, even a large relative error for $N_i$ will only account for a small absolute error.

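More precisely, by the Estimator Theorem, the Monte Carlo method yields an $\left(\frac{\epsilon}{\sqrt{p_i}}, \frac{\delta}{m+1}\right)$-approximation for $N_i$ with $\frac{4}{\epsilon^2} \ln \frac{2(m+1)}{\delta}$ samples. In other words, if we run the Monte Carlo method with $O(\log m)$ samples to estimate each $N_i$, we can obtain $\frac{\epsilon}{\sqrt{p_i}}$-approximations $\hat{N}_i$ for all $N_i$ with probability at least $1 - \delta$. This implies:

$$\left| \sum_{i=0}^{m} (\hat{N}_i - N_i) p^i (1-p)^{m-i} \right| \le \sum_{i=0}^{m} \frac{\epsilon}{\sqrt{p_i}} N_i p^i (1-p)^{m-i} \le \epsilon \sum_{i=0}^{m} \binom{m}{i} p^i (1-p)^{m-i} = \epsilon,$$

where the second inequality uses $N_i = p_i \binom{m}{i}$ and $p_i \le 1$.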

This means that we can compute an $\epsilon$-absolute-approximation for the failure polynomial $F(p)$ with high probability with a total of $O(m \log m)$ iterations. Unlike the case for $\epsilon$-approximation, the number of iterations is independent of the Min Cross Layer Cut value $d$. This makes the method efficient even in settings where $d$ can be large.
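
To make the $O(m \log m)$ count concrete, the sketch below computes a per-coefficient and total sample budget. It assumes that the failure probability $\delta$ is split evenly over the $m+1$ coefficients by a union bound, which is one natural reading of the argument above rather than a detail confirmed by the text; the function name is hypothetical.

```python
from math import ceil, log

def absolute_error_samples(m, eps, delta):
    """Return (samples per coefficient, total samples) for an eps-absolute approximation."""
    # Assumed split: each of the m + 1 coefficients gets failure budget delta / (m + 1).
    per_coeff = ceil(4 / eps ** 2 * log(2 * (m + 1) / delta))
    return per_coeff, (m + 1) * per_coeff

per_coeff, total = absolute_error_samples(m=29, eps=0.01, delta=0.01)
print(per_coeff, total)
```

Note that neither quantity depends on the Min Cross Layer Cut value $d$, in line with the remark above.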


3.8  Improving  Reliability  via  MCLC  Maximization 

As illustrated in Section 3.1, lightpath routing in a layered network plays an important role in the reliability. Designing a lightpath routing that maximizes reliability, however, is a very complex problem. As we have seen in Figure 3-1, a lightpath routing that is optimal for a certain value of $p$ may not perform as well for other values of $p$. This makes the network design aspect of cross-layer reliability a challenging and interesting problem.

In  this  section,  we  study  the  reliability  performance  of  several  lightpath  routing 
algorithms  presented  in  Chapter  2,  whose  objective  is  to  maximize  the  Min  Cross 
Layer  Cut  (MCLC).  As  discussed  in  Section  3.3.1,  maximizing  the  MCLC  is  closely 
related  to  maximizing  reliability,  especially  for  small  values  of  p.  The  relationship 
between  the  two  quantities  is  described  by  Theorem  3.11.  We  state  the  main  result 
relevant  to  this  chapter  here.  The  proof  will  be  given  in  Chapter  4,  where  a  generalized 
version  of  the  theorem  is  presented. 

Assume that logical and physical topologies are given. Consider two lightpath routings 1 and 2 for these topologies. Let $d$ be the MCLC of lightpath routing 1, and $F_1(p)$ be its failure polynomial. Similarly, let $c$ and $F_2(p)$ be the MCLC and failure polynomial of lightpath routing 2, respectively. The failure polynomials $F_1(p)$ and $F_2(p)$ are given by

$$F_1(p) = \sum_{i=d}^{m} N_i p^i (1-p)^{m-i}$$
$$F_2(p) = \sum_{i=c}^{m} M_i p^i (1-p)^{m-i}$$

Theorem 3.11 Assume $d > c$. Then, there exists a positive number $p_0$ such that $F_1(p) < F_2(p)$ for $p < p_0$. In particular,

$$p_0 = \frac{(c+1) M_c}{m \binom{m}{c}}.$$

Motivated  by  Theorem  3.11,  we  will  investigate  the  reliability  performance  of  the 
lightpath  routing  algorithms  studied  in  Chapter  2,  whose  objectives  are  to  maximize 
the  MCLC. 

3.8.1  Simulation  Studies 

In Chapter 2, we showed that the multi-commodity flow algorithm, MCF_MinCut, and its enhanced version, MCF_LF, outperform the existing survivable lightpath routing algorithm, SURVIVE [76], in terms of MCLC performance. Since MCLC is closely tied to cross-layer reliability, it is therefore interesting to see whether a similar observation holds in terms of reliability to random failures.

We used the augmented NSFNET (Figure 3-3) as the physical topology, and generated 350 random logical topologies with size from 6 to 12 nodes and connectivity at least 4. We study the reliability performance of the three lightpath routing algorithms: MCF_LF, MCF_MinCut and SURVIVE. For each lightpath routing generated by the algorithms, we compute an approximate failure polynomial using the technique proposed in Section 3.5, and evaluate its reliability.

Figure 3-7 shows the cumulative distributions of reliability for the lightpath routings generated by the three algorithms, with $p = 0.1$. The multi-commodity flow based algorithms, which try to maximize the MCLC of the lightpath routings, were able to generate more lightpath routings with higher reliability than SURVIVE, whose objective is to find a lightpath routing with MCLC at least two. For small $p$, the term $N_d p^d (1-p)^{m-d}$, where $d$ is the Min Cross Layer Cut, dominates other terms in the failure polynomial. Therefore, maximizing $d$ has the effect of maximizing the reliability of the network.
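
The dominance of the lowest-order term can be checked numerically on a toy instance. The coefficient vectors below are hypothetical and not taken from the simulations: routing A has two disjoint cross-layer cuts of size 2 on $m = 4$ links, while routing B has a single cut of size 1.

```python
def failure_polynomial(N, p):
    """Evaluate F(p) = sum_i N[i] * p^i * (1 - p)^(m - i), with m = len(N) - 1."""
    m = len(N) - 1
    return sum(n * p ** i * (1 - p) ** (m - i) for i, n in enumerate(N))

# Hypothetical coefficients N_0..N_4 (every superset of a cut is also a cut).
N_A = [0, 0, 2, 4, 1]   # MCLC d = 2: cuts {1,2} and {3,4}
N_B = [0, 1, 3, 3, 1]   # MCLC c = 1: cut {1}

for p in (0.01, 0.9):
    print(p, failure_polynomial(N_A, p), failure_polynomial(N_B, p))
```

In this toy case the routing with the larger MCLC is more reliable at $p = 0.01$ but less reliable at $p = 0.9$, which mirrors the caveat that the benefit of maximizing $d$ holds for small $p$.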


Figure  3-7:  Reliability  CDF  for  different  algorithms  with  p  =  0.1,  which  shows  the  number  of 
instances  with  unreliability  less  than  the  value  given  by  the  x-axis. 

The dependence of reliability on lightpath routing and link failure probability $p$ is further illustrated by Figure 3-8, which plots the ratio and absolute difference of average failure probabilities of the lightpath routings generated by the three algorithms, using MCF_MinCut as the baseline. When $p$ is small, the multi-commodity flow routing algorithms are clearly better than SURVIVE in terms of the average reliability. However, as $p$ gets larger, the difference in reliability performance among the algorithms diminishes. In fact, as seen in Figure 3-8(b), the reliability of all three algorithms is very close. This is because for large $p$, the unreliability for any lightpath routing would be very close to 1.

(a) Unreliability ratio

(b) Unreliability difference

Figure 3-8: Ratio and absolute difference of average unreliabilities among different algorithms.


Figure 3-9 compares the average $N_i$ values of the lightpath routings generated by the algorithms. Again using MCF_MinCut as the baseline, Figure 3-9 shows that none of the algorithms dominates the others in all $N_i$ values. The multi-commodity flow algorithms try to maximize the Min Cross Layer Cut at the expense of creating more cross-layer cuts of larger size. The objective for SURVIVE, on the other hand, is to minimize the total number of physical hops subject to the constraint that MCLC is at least two. In an environment where $p$ is high, minimizing the physical hops may be a better strategy, as we have seen in Figure 3-1. This is reflected by the fact that lightpath routings produced by SURVIVE have smaller average $N_i$ values when $i$ is large.

In  the  setting  of  WDM  networks,  we  expect  p  to  be  typically  small.  Therefore, 
maximizing  the  Min  Cross  Layer  Cut  appears  to  be  a  reasonable  strategy.  However, 
it  is  important  to  keep  in  mind  that  the  same  insight  may  not  apply  to  other  settings 
where  physical  links  fail  with  high  probability  (e.g.  Delay  Tolerant  Networks). 


Figure 3-9: Difference in average $N_i$ values among different algorithms.

3.9 Extensions to the Failure Model

In this section, we present a few extensions to the failure model and discuss the application of the reliability estimation method to these extensions.


3.9.1  Non-uniform  Failure  Probabilities 

In the non-uniform physical link failure model, each physical link $(i,j)$ fails with probability $p_{ij}$. The physical topology can be approximated by replacing each physical link $(i,j)$ by $k = \mathrm{round}\left(\frac{\log(1-p_{ij})}{\log(1-p')}\right)$ physical links in series, where round() is the rounding function and $p'$ is a constant that represents the link failure probability of the transformed network (Figure 3-10). In this case, the probability that none of the replacements for $(i,j)$ fail equals:

$$(1-p')^{k} = (1-p')^{\frac{\log(1-p_{ij})}{\log(1-p')}} (1-p')^{e} = (1-p_{ij})(1-p')^{e},$$

where $|e| = \left| \mathrm{round}\left(\frac{\log(1-p_{ij})}{\log(1-p')}\right) - \frac{\log(1-p_{ij})}{\log(1-p')} \right| \le 0.5$. Therefore, this probability can be made arbitrarily close to $1 - p_{ij}$ by choosing a sufficiently small $p'$, with the tradeoff being a larger number of new links. In this case, the lightpath routing can then be modified such that a logical link originally using $(i,j)$ is now routed over its replacements. This gives us an equivalent layered network where every physical link fails independently with probability $p'$.
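
The transformation can be sketched in a few lines of Python. The function name and the example probabilities are illustrative assumptions; the formula for the number of series links follows the text.

```python
from math import log

def series_replacement(p_ij, p_prime):
    """Return (k, effective failure probability) when a link of failure probability
    p_ij is replaced by k series links that each fail with probability p_prime."""
    k = round(log(1 - p_ij) / log(1 - p_prime))
    survive = (1 - p_prime) ** k       # probability that none of the k links fail
    return k, 1 - survive

k, p_eff = series_replacement(p_ij=0.05, p_prime=0.001)
print(k, p_eff)
```

A smaller `p_prime` gives an effective failure probability closer to `p_ij` at the cost of a larger `k`, and hence a larger transformed network.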


Figure 3-10: A physical link with failure probability $p$ is equivalent to $k = \log(1-p) / \log(1-p')$ physical links in series with failure probability $p'$.


3.9.2  Random  Node  Failures 

The reliability estimation method can be extended to a model where each physical link fails with probability $p$ and each physical node fails with probability $q$. We can model a network state as the set of failed physical nodes and links, and a logical link will fail if any of the physical nodes and links it uses fail. In this case, a cross-layer cut is a set of physical nodes and links whose failures would cause the logical topology to be disconnected. The failure probability (one minus the reliability) of the layered network can then be expressed as follows:

$$\sum_{i=0}^{m} \sum_{j=0}^{n} N_{i,j}\, p^i (1-p)^{m-i} q^j (1-q)^{n-j},$$

where $m$, $n$ are the numbers of physical links and nodes respectively, and $N_{i,j}$ is the number of cross-layer cuts with $i$ failed physical links and $j$ failed physical nodes. Then we can estimate the reliability in a similar fashion, by approximating each $N_{i,j}$ separately via the Monte Carlo method. To estimate $N_{i,j}$, network states with $i$ fibers and $j$ nodes will be uniformly sampled. The methods in Sections 3.4.2 and 3.5.1 to establish lower bounds on $N_i$ can be extended to establish lower bounds on $N_{i,j}$, based on a similar observation in this setting that any network state that contains a cross-layer cut is also a cross-layer cut.
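
A minimal sketch of the sampling step for $N_{i,j}$ is given below. It assumes a caller-supplied predicate `is_cut` that decides whether a set of failed links and nodes disconnects the logical topology; that predicate, the function name, and the requirement that link and node labels be distinct are assumptions of this illustration.

```python
import random
from math import comb

def estimate_N_ij(links, nodes, i, j, is_cut, samples, rng=random):
    """Estimate the number of cross-layer cuts with i failed links and j failed nodes."""
    hits = 0
    for _ in range(samples):
        # Uniformly sample a network state with exactly i links and j nodes failed.
        state = set(rng.sample(links, i)) | set(rng.sample(nodes, j))
        hits += bool(is_cut(state))
    # Scale the observed cut density by the number of such states.
    return comb(len(links), i) * comb(len(nodes), j) * hits / samples

links = ["e1", "e2", "e3", "e4"]
nodes = ["a", "b", "c"]
# Toy predicate (hypothetical): the network fails whenever node "a" fails.
print(estimate_N_ij(links, nodes, 1, 1, lambda s: "a" in s, samples=1000))
```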


3.10  Conclusion 

We consider network reliability in multi-layer networks. In this setting, logical link failures can be correlated even if physical links fail independently. Hence, conventional estimation methods that assume particular topologies, independent failures, and network parameters cannot be used for our problem. To that end, we develop a Monte Carlo simulation based estimation algorithm that approximates cross-layer reliability with high probability. We first extend the classical polynomial expression for reliability to multi-layer networks. Our algorithm approximates the failure polynomial by estimating the values of its coefficients. The advantages of our approach are twofold. First, it does not require resampling for different values of link failure probability $p$. Second, with a polynomial number of iterations, it guarantees the accuracy of estimation with high probability. We also observe through the polynomial expression that lightpath routings that maximize the MCLC can perform very well in terms of reliability. This observation leads to the development of lightpath routing algorithms that attempt to maximize reliability.

While sampling failure states, our estimation algorithm naturally reveals the vulnerable parts of the network or lightpath routing. This information can be used to enhance the current lightpath routing. In Chapter 5 we will explore different approaches of improving the reliability of a network using such information.


3.11  Chapter  Appendix 

3.11.1  Approximating  Union  of  Sets 

As seen in Section 3.5, given a set of cross-layer cuts $\mathcal{F}$, the value $\frac{|\partial_i(\mathcal{F})|}{\binom{m}{i}}$ gives a lower bound for $p_i$. We will discuss in this section how to estimate the size of $\partial_i(\mathcal{F}) = \bigcup_{C_j \in \mathcal{F}} \partial_i(C_j)$ probabilistically.

Computing the value of $|\partial_i(\mathcal{F})|$ can be formulated as the Union of Sets Problem [64], for which a Monte Carlo method exists to estimate $|\partial_i(\mathcal{F})|$ using the technique of importance sampling. Here, we define the ground set $U$ to be:

$$U := \{(S, j) : C_j \in \mathcal{F},\ S \in \partial_i(C_j)\},$$

and the events of interest $G$ to be

$$G := \{(S, j) : S \in \partial_i(\mathcal{F}),\ j = \min\{k : S \in \partial_i(C_k)\}\}.$$

In other words, the ground set $U$ represents a multi-set where each set $S$ in $\partial_i(\mathcal{F})$ is represented $k$ times in $U$, where $k$ is the number of elements in $\mathcal{F}$ that are subsets of $S$. On the other hand, each set $S$ in $\partial_i(\mathcal{F})$ is represented by exactly one element $(S, j)$ in $G$, where $C_j$ is the first element in $\mathcal{F}$ that is a subset of $S$. As a result, for each $S \in \partial_i(\mathcal{F})$, $|\{(T, j) \in U : T = S\}| \le |\mathcal{F}|$, and $|\{(T, j) \in G : T = S\}| = 1$. It immediately follows that:

$$|G| = |\partial_i(\mathcal{F})|,$$

and

$$\frac{|G|}{|U|} \ge \frac{1}{|\mathcal{F}|}.$$

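Therefore, by the Estimator Theorem, if we sample from $U$ uniformly at random for $T$ times, where:

$$T = \frac{4 |\mathcal{F}|}{\epsilon_{lb}^2} \ln \frac{2}{\delta_{lb}} \ge \frac{4 |U|}{\epsilon_{lb}^2 |G|} \ln \frac{2}{\delta_{lb}},$$

the Monte Carlo method will yield an $\epsilon_{lb}$-approximation for $|G|$, which is equal to $|\partial_i(\mathcal{F})|$, with probability at least $1 - \delta_{lb}$.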

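Finally, the sample space $U$ can be sampled uniformly at random as follows:

1. Select an element $j$ from $\{1, \ldots, |\mathcal{F}|\}$, where the probability of selecting $j$ is $\frac{|\partial_i(C_j)|}{\sum_k |\partial_i(C_k)|}$. Note that $|\partial_i(C_j)| = \binom{m - |C_j|}{i - |C_j|}$, which can be computed easily.

2. Given the selected value $j$, pick a set $S \in \partial_i(C_j)$ uniformly at random.

The probability of selecting each element $(S, j) \in U$ is therefore:

$$\frac{|\partial_i(C_j)|}{\sum_k |\partial_i(C_k)|} \cdot \frac{1}{|\partial_i(C_j)|} = \frac{1}{\sum_k |\partial_i(C_k)|} = \frac{1}{|U|}.$$

This gives us a method to establish a probabilistic lower bound $\hat{p}_i$ for $p_i$.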
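
The two sampling steps, together with the membership test for $G$, can be put together as follows. This is a sketch under the assumption that physical links are labeled $0, \ldots, m-1$ and that cuts are given as sets of such labels; the function name is hypothetical.

```python
import random
from math import comb

def estimate_shadow_size(cuts, m, i, samples, rng=random):
    """Estimate the number of size-i subsets of {0..m-1} that contain some cut in `cuts`."""
    cuts = [frozenset(c) for c in cuts if len(c) <= i]
    weights = [comb(m - len(c), i - len(c)) for c in cuts]   # size of each cut's upper shadow
    total = sum(weights)                                     # |U|
    if total == 0:
        return 0.0
    hits = 0
    for _ in range(samples):
        # Step 1: pick j with probability proportional to its upper shadow size.
        j = rng.choices(range(len(cuts)), weights=weights)[0]
        # Step 2: pick a uniformly random size-i superset S of C_j.
        rest = [e for e in range(m) if e not in cuts[j]]
        S = cuts[j] | set(rng.sample(rest, i - len(cuts[j])))
        # (S, j) belongs to G iff C_j is the first cut contained in S.
        first = next(k for k, c in enumerate(cuts) if c <= S)
        hits += (first == j)
    return total * hits / samples

print(estimate_shadow_size([{0, 1}, {1, 2}, {3}], m=6, i=3, samples=20000))
```

Because each $S$ is counted only through its first containing cut, the estimate targets the size of the union rather than the sum of the individual shadow sizes.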


3.11.2  Proof  of  Theorem  3.10 

Let $[m] = \{1, \ldots, m\}$ and let $\mathcal{H}_i^m = \{S \subseteq [m] : |S| = i\}$ be the family of subsets of $[m]$ with size $i$, and let $\mathcal{H}_i^m(k)$ be the first $k$ subsets in $\mathcal{H}_i^m$ under the lexicographical ordering. In addition, for any family $\mathcal{F}$ of subsets of $[m]$ and for any $j \ge i$, let $\partial_j^m(\mathcal{F})$ be the $j$th upper shadow of $\mathcal{F}$ over $[m]$. Theorem 3.10 states that:

Theorem 3.10 For $i \le j \le m$ and $1 \le k \le \binom{m}{i}$, let $w = \max\{0 \le r \le i : \binom{m-r}{i-r} \ge k\}$. Also, let $t = m - (w+1)$, $u = j - (w+1)$ and $v = i - (w+1)$. Then:

$$|\partial_j^m(\mathcal{H}_i^m(k))| = \begin{cases} \binom{m-i}{j-i}, & \text{if } k = 1 \\ \binom{t}{u} + |\partial_{u+1}^{t}(\mathcal{H}_{v+1}^{t}(k - \binom{t}{v}))|, & \text{otherwise.} \end{cases}$$

The case for $k = 1$ follows from the fact that a set with size $i$ has $\binom{m-i}{j-i}$ supersets with size $j$. We will prove the case where $k > 1$ in the rest of the section.

Let $S$ be the lexicographically largest element in $\mathcal{H}_i^m(k)$. We first prove the following lemma:
Lemma 3.12 $[w] \subseteq S$ and $w + 1 \notin S$.

Proof. Suppose the lemma is not true. We have the following two cases:

1. $S$ does not contain some element $e \in [w]$. In this case, all elements of $\mathcal{H}_i^m$ that contain $[w]$ are lexicographically smaller than $S$ and thus belong to $\mathcal{H}_i^m(k)$. Therefore, $k = |\mathcal{H}_i^m(k)| > \binom{m-w}{i-w}$. This contradicts the fact that $\binom{m-w}{i-w} \ge k$.

2. $S$ contains $[w+1]$. So any set $T \in \mathcal{H}_i^m$ that does not contain $[w+1]$ is lexicographically greater than $S$, and therefore cannot be in $\mathcal{H}_i^m(k)$. As a result, $k = |\mathcal{H}_i^m(k)| \le \binom{m-(w+1)}{i-(w+1)}$. However, by definition of $w$, we have $\binom{m-(w+1)}{i-(w+1)} < k$, which is a contradiction.

□

Corollary 3.13 All elements in $\mathcal{H}_i^m(k)$ must contain $[w]$.

Proof. Any element in $\mathcal{H}_i^m(k)$ must be lexicographically at most $S$, and therefore must contain $[w]$. □

Corollary 3.14 All elements in $\mathcal{H}_i^m$ that contain $[w+1]$ are in $\mathcal{H}_i^m(k)$.

Proof. Any element that contains $[w+1]$ is lexicographically smaller than $S$, and therefore belongs to $\mathcal{H}_i^m(k)$. □

We now partition the family $\mathcal{H}_i^m(k)$ into two sub-families:

• $\mathcal{H}_i^m(k)^+ := \{T \in \mathcal{H}_i^m(k) : w + 1 \in T\}$

• $\mathcal{H}_i^m(k)^- := \{T \in \mathcal{H}_i^m(k) : w + 1 \notin T\}$

As a result of Corollaries 3.13 and 3.14, $\mathcal{H}_i^m(k)^+$ consists of all $\binom{m-(w+1)}{i-(w+1)}$ elements in $\mathcal{H}_i^m$ that contain $[w+1]$, and $\mathcal{H}_i^m(k)^-$ consists of the next $k - \binom{m-(w+1)}{i-(w+1)}$ elements in the lexicographical order. We define a bijection $g_w$ on $\mathcal{H}_i^m(k)^-$ as follows:

$$g_w(T) := \{e - (w+1) : e \in T - [w]\}, \quad \forall T \in \mathcal{H}_i^m(k)^-. \tag{3.6}$$

In other words, for any $T \in \mathcal{H}_i^m(k)^-$, we construct $g_w(T)$ by first removing the common subset $[w]$ from $T$ and then subtracting $w + 1$ from each remaining element. As a result, each $g_w(T)$ is a subset of $[m - (w+1)]$. The image $g_w(\mathcal{H}_i^m(k)^-)$ consists of the first $k - \binom{m-(w+1)}{i-(w+1)}$ subsets of $[m - (w+1)]$ with size $i - w$ in lexicographical order. In other words, we have:

$$g_w(\mathcal{H}_i^m(k)^-) = \mathcal{H}_{i-w}^{m-(w+1)}\left(k - \binom{m-(w+1)}{i-(w+1)}\right). \tag{3.7}$$

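Now, consider $\partial_j^m(\mathcal{H}_i^m(k))$, the $j$th upper shadow over $[m]$ for $\mathcal{H}_i^m(k)$. As a result of Corollary 3.13, all elements in $\partial_j^m(\mathcal{H}_i^m(k))$ must contain $[w]$. We can therefore partition $\partial_j^m(\mathcal{H}_i^m(k))$ in a similar fashion:

• $\partial_j^m(\mathcal{H}_i^m(k))^+ := \{T \in \partial_j^m(\mathcal{H}_i^m(k)) : w + 1 \in T\}$

• $\partial_j^m(\mathcal{H}_i^m(k))^- := \{T \in \partial_j^m(\mathcal{H}_i^m(k)) : w + 1 \notin T\}$

We now prove the following properties of $\partial_j^m(\mathcal{H}_i^m(k))^+$ and $\partial_j^m(\mathcal{H}_i^m(k))^-$, which allow us to express the cardinality of the upper shadow in Theorem 3.10.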

Lemma 3.15 $\partial_j^m(\mathcal{H}_i^m(k))^+ = \{T \in \mathcal{H}_j^m : [w+1] \subseteq T\}$.

Proof. Every element $T$ in $\partial_j^m(\mathcal{H}_i^m(k))^+$ must contain $[w]$, by Corollary 3.13, and $w + 1$, by definition. Therefore, $T$ must contain $[w+1]$. In addition, for any element $T$ in $\mathcal{H}_j^m$ that contains $[w+1]$, let $U$ be the set with the $i$ smallest elements in $T$. Since $i \ge w + 1$, $U$ contains $[w+1]$ and is in $\mathcal{H}_i^m(k)$ by Corollary 3.14. As a result, the set $T$, being a superset of $U$, is in the $j$th upper shadow of $\mathcal{H}_i^m(k)$. □

Corollary 3.16 $|\partial_j^m(\mathcal{H}_i^m(k))^+| = \binom{m-(w+1)}{j-(w+1)}$.


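Lemma 3.17 $g_w(\partial_j^m(\mathcal{H}_i^m(k))^-) = \partial_{j-w}^{m-(w+1)}(g_w(\mathcal{H}_i^m(k)^-))$.

Proof. For any element $T \in \partial_j^m(\mathcal{H}_i^m(k))^-$, there must exist an element $U \in \mathcal{H}_i^m(k)$ such that $U \subseteq T$. Since $w + 1 \notin T$, it follows that $w + 1 \notin U$, which implies $U \in \mathcal{H}_i^m(k)^-$. By applying the same bijection $g_w$ to $\partial_j^m(\mathcal{H}_i^m(k))^-$, $g_w(T)$ is a subset of $[m - (w+1)]$ with size $j - w$, and is a superset of $g_w(U)$. In other words:

$$g_w(\partial_j^m(\mathcal{H}_i^m(k))^-) \subseteq \partial_{j-w}^{m-(w+1)}(g_w(\mathcal{H}_i^m(k)^-)).$$

Now given $T \in \partial_{j-w}^{m-(w+1)}(g_w(\mathcal{H}_i^m(k)^-))$, there exists $U \in g_w(\mathcal{H}_i^m(k)^-)$ such that $U \subseteq T$. It follows that $g_w^{-1}(U) \subseteq g_w^{-1}(T)$. Since $g_w^{-1}(U) \in \mathcal{H}_i^m(k)^-$, it follows that $g_w^{-1}(T) \in \partial_j^m(\mathcal{H}_i^m(k))^-$. Therefore, $T \in g_w(\partial_j^m(\mathcal{H}_i^m(k))^-)$, which means

$$g_w(\partial_j^m(\mathcal{H}_i^m(k))^-) \supseteq \partial_{j-w}^{m-(w+1)}(g_w(\mathcal{H}_i^m(k)^-)),$$

which proves the lemma. □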


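Corollary 3.18 $|\partial_j^m(\mathcal{H}_i^m(k))^-| = \left|\partial_{j-w}^{m-(w+1)}\left(\mathcal{H}_{i-w}^{m-(w+1)}\left(k - \binom{m-(w+1)}{i-(w+1)}\right)\right)\right|$.

Proof.

$$\begin{aligned} |\partial_j^m(\mathcal{H}_i^m(k))^-| &= |g_w(\partial_j^m(\mathcal{H}_i^m(k))^-)| \\ &= |\partial_{j-w}^{m-(w+1)}(g_w(\mathcal{H}_i^m(k)^-))| \\ &= \left|\partial_{j-w}^{m-(w+1)}\left(\mathcal{H}_{i-w}^{m-(w+1)}\left(k - \binom{m-(w+1)}{i-(w+1)}\right)\right)\right|. \end{aligned}$$

The second equality is due to Lemma 3.17, and the third equality is due to Equation (3.7). □

The expression for $|\partial_j^m(\mathcal{H}_i^m(k))|$ for $k > 1$ follows immediately from Corollaries 3.16 and 3.18.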


3.11.3  Estimating  Reliability  by  Importance  Sampling 

As  discussed  in  Section  3.4,  estimating  reliability  by  directly  simulating  physical  link 
failures  requires  a  large  sample  size  when  the  link  failure  probability  p  is  small,  due 
to  the  large  coefficient  of  variation  of  the  estimator.  In  this  section,  we  discuss  how 
importance  sampling  can  be  used  to  reduce  the  coefficient  of  variation. 

Given the physical and logical topologies and a lightpath routing, let $\mathcal{H}$ be the sample space, that is, all possible subsets of the physical links $E_P$. Given a network state $S \in \mathcal{H}$, the 0-1 random variable $U(S)$ is defined to be 1 if and only if $S$ is a cross-layer cut. Suppose each physical link fails with probability $p$. Then the unreliability of the layered network is simply the expected value, $E_p(U)$, of $U$, where the subscript $p$ indicates that the expectation is taken over the probability distribution where every physical link fails with probability $p$. It can be written as follows:

$$\begin{aligned} E_p(U) &= \sum_{S \in \mathcal{H}} U(S) \Pr(S) \\ &= \sum_{i=0}^{m} N_i p^i (1-p)^{m-i} \\ &= \sum_{i=0}^{m} N_i (p')^i (1-p')^{m-i} \frac{p^i (1-p)^{m-i}}{(p')^i (1-p')^{m-i}} \\ &= E_{p'}(U'), \end{aligned} \tag{3.8}$$

where $N_i$ is the number of cross-layer cuts with size $i$, and $U'$, called the likelihood ratio estimator, is a random variable on $\mathcal{H}$ such that:

$$U'(S) = \begin{cases} \frac{p^i (1-p)^{m-i}}{(p')^i (1-p')^{m-i}}, & \text{if } S \text{ is a cross-layer cut, where } |S| = i \\ 0, & \text{if } S \text{ is not a cross-layer cut.} \end{cases}$$

Equation (3.8) implies that the expected value of $U$ at link failure probability $p$
is equal to the expected value of $U'$ at link failure probability $p'$. Therefore, we
can sample the value of $U'$ at link failure probability $p'$ to obtain an unbiased
estimate of the unreliability of the network at link failure probability $p$. The
variance of $U'$ is given by:

$$
\begin{aligned}
\mathrm{Var}_{p'}(U') &= E_{p'}\big((U')^2\big) - \big(E_{p'}(U')\big)^2 \\
  &= \sum_{i=0}^{m} N_i\, (p')^i (1-p')^{m-i}
     \left( \frac{p^i (1-p)^{m-i}}{(p')^i (1-p')^{m-i}} \right)^{2} - \big(E_p(U)\big)^2 \\
  &= E_p(U') - \big(E_p(U)\big)^2.
\end{aligned}
\tag{3.9}
$$
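Equations (3.8) and (3.9) can be checked numerically once a cut vector is known. The following is a minimal sketch, not part of the original algorithm: the function names and the cut vector `[0, 0, 3, 1]` (a logical triangle routed one-to-one over a physical triangle, so m = 3) are assumptions made for illustration.

```python
def state_probability(i, m, q):
    """Probability of one specific state with i failed links out of m,
    when each link fails independently with probability q."""
    return q**i * (1 - q)**(m - i)


def unreliability(N, p):
    """E_p(U) = sum_i N_i p^i (1-p)^(m-i), the second line of (3.8)."""
    m = len(N) - 1
    return sum(N_i * state_probability(i, m, p) for i, N_i in enumerate(N))


def sampled_mean_and_variance(N, p, p_prime):
    """Exact mean and variance of the likelihood ratio estimator U'
    when states are drawn at failure probability p_prime."""
    m = len(N) - 1
    mean = 0.0
    second_moment = 0.0
    for i, N_i in enumerate(N):
        # Total probability, under p_prime, of drawing a cut of size i.
        weight = N_i * state_probability(i, m, p_prime)
        # Value taken by U' on any cut of size i.
        ratio = state_probability(i, m, p) / state_probability(i, m, p_prime)
        mean += weight * ratio
        second_moment += weight * ratio**2
    return mean, second_moment - mean**2


# Hypothetical cut vector: every pair of links, and all three links, form a cut.
N = [0, 0, 3, 1]
mean, variance = sampled_mean_and_variance(N, p=0.01, p_prime=0.5)
```

Here `mean` agrees with `unreliability(N, 0.01)` up to rounding, which is the unbiasedness stated by (3.8), and `variance` is the quantity in (3.9).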
Estimating Unreliability at Small p With Importance Sampling


The major design decision involved in importance sampling is the choice of the new
sampling distribution, which, in our case, is the choice of $p'$. Since direct sampling
is less effective when the link failure probability $p$ is small, it makes sense to
choose $p'$ to optimize for this case.


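Consider a lightpath routing with Min Cross Layer Cut value $d$. When $p$ is
sufficiently small, the value of $E_p(U)$ can be bounded as follows:

$$
\begin{align}
N_d\, p^d (1-p)^{m-d} \;\le\; E_p(U)
  &= \sum_{i=d}^{m} N_i\, p^i (1-p)^{m-i} \tag{3.10} \\
  &= N_d\, p^d (1-p)^{m-d} + \sum_{i=d+1}^{m} N_i\, p^i (1-p)^{m-i} \notag \\
  &\le (1+\epsilon')\, N_d\, p^d (1-p)^{m-d}, \tag{3.11}
\end{align}
$$

where $\epsilon'$ is a small constant. Similarly, the value of $E_p(U')$ can be bounded
as follows:

$$
\begin{align}
\frac{N_d\, p^{2d} (1-p)^{2(m-d)}}{(p')^d (1-p')^{m-d}} \;\le\; E_p(U')
  &= \sum_{i=d}^{m} N_i\, p^i (1-p)^{m-i} \cdot
     \frac{p^i (1-p)^{m-i}}{(p')^i (1-p')^{m-i}} \tag{3.12} \\
  &= \frac{N_d\, p^{2d} (1-p)^{2(m-d)}}{(p')^d (1-p')^{m-d}}
     + \sum_{i=d+1}^{m} \frac{N_i\, p^{2i} (1-p)^{2(m-i)}}{(p')^i (1-p')^{m-i}} \notag \\
  &\le (1+\epsilon)\, \frac{N_d\, p^{2d} (1-p)^{2(m-d)}}{(p')^d (1-p')^{m-d}}, \tag{3.13}
\end{align}
$$

where $\epsilon$ is likewise a small constant.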


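Therefore, the squared coefficient of variation of $U'$ is bounded as follows:

$$
\begin{align*}
\frac{\mathrm{Var}_{p'}(U')}{\big(E_p(U)\big)^2}
  &= \frac{E_p(U')}{\big(E_p(U)\big)^2} - 1, && \text{by (3.9)} \\
  &\le \frac{(1+\epsilon)\, N_d\, p^{2d} (1-p)^{2(m-d)}}{(p')^d (1-p')^{m-d}}
       \cdot \frac{1}{N_d^2\, p^{2d} (1-p)^{2(m-d)}} - 1, && \text{by (3.10) and (3.13)} \\
  &= \frac{1+\epsilon}{N_d\, (p')^d (1-p')^{m-d}} - 1,
\end{align*}
$$

and:

$$
\begin{align*}
\frac{\mathrm{Var}_{p'}(U')}{\big(E_p(U)\big)^2}
  &= \frac{E_p(U')}{\big(E_p(U)\big)^2} - 1, && \text{by (3.9)} \\
  &\ge \frac{N_d\, p^{2d} (1-p)^{2(m-d)}}{(p')^d (1-p')^{m-d}}
       \cdot \frac{1}{(1+\epsilon')^2 N_d^2\, p^{2d} (1-p)^{2(m-d)}} - 1, && \text{by (3.11) and (3.12)} \\
  &= \frac{1}{(1+\epsilon')^2\, N_d\, (p')^d (1-p')^{m-d}} - 1.
\end{align*}
$$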


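The term $\frac{1}{N_d (p')^d (1-p')^{m-d}}$ is minimized when $p' = \frac{d}{m}$.
Therefore, if the Monte Carlo method samples network states at $p' = \frac{d}{m}$, the
squared coefficient of variation will be:

$$
\frac{\mathrm{Var}_{p'}(U')}{\big(E_{p'}(U')\big)^2}
  = O\!\left( \frac{1}{N_d\, (p')^d (1-p')^{m-d}} \right)
  = O\!\left( \frac{1}{N_d \left(\frac{d}{m}\right)^{d} \left(1 - \frac{d}{m}\right)^{m-d}} \right)
  = O(m^d).
$$

The last step treats $d$ as a constant and uses $N_d \ge 1$ together with
$\left(1 - \frac{d}{m}\right)^{m-d} \ge e^{-d}$.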
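The choice $p' = d/m$ can be verified with a short calculus argument. The worked step below is a sketch added for illustration; the helper name $g$ is an assumption and does not appear in the original text.

```latex
% Minimizing 1 / (N_d p'^d (1-p')^{m-d}) over 0 < p' < 1 is the same as
% maximizing g(p') = p'^d (1-p')^{m-d}, assuming 0 < d < m.
\begin{align*}
\ln g(p') &= d \ln p' + (m-d) \ln (1-p'), \\
\frac{\partial}{\partial p'} \ln g(p')
  &= \frac{d}{p'} - \frac{m-d}{1-p'} = 0
  \;\Longleftrightarrow\; d\,(1-p') = (m-d)\,p'
  \;\Longleftrightarrow\; p' = \frac{d}{m}.
\end{align*}
% The second derivative, -d/p'^2 - (m-d)/(1-p')^2, is negative on (0,1),
% so p' = d/m is the unique maximizer of g.
```

This is the familiar fact that a binomial likelihood with $d$ successes in $m$ trials is maximized at the empirical frequency $d/m$.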


Therefore, the sample size needed to establish an $(\epsilon, \delta)$-approximation,
which is proportional to the squared coefficient of variation [97], is $O(m^d)$ when $p$
is small. Like the algorithm introduced in Section 3.4, knowledge of the Min Cross Layer
Cut value $d$ is needed to carry out the Monte Carlo method efficiently. This gives us
the following importance sampling algorithm IS to efficiently estimate the cross-layer
reliability when $p$ is small.


Algorithm 3 IS

1: Compute the MCLC value $d$ for the lightpath routing.

2: Simulate, for $T = O(m^d)$ times, the event that each physical link fails with
probability $p' = \frac{d}{m}$. Let $C$ be the set of the $T$ samples collected.

3: For each $i \in \{0, \ldots, m\}$, count the number of cross-layer cuts in $C$ with
exactly $i$ physical links, and denote the count as $M_i$.

4: For any link failure probability $p$, the estimated unreliability is given by

$$
\frac{1}{T} \sum_{i=0}^{m} M_i\, \frac{p^i (1-p)^{m-i}}{(p')^i (1-p')^{m-i}}.
$$
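The steps of Algorithm IS can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's implementation: the data layout (a lightpath routing as a dictionary from logical links to sets of physical link indices), the function names, the toy network, and the fixed sample size `T` are all hypothetical, and the MCLC value `d` is taken as given rather than computed.

```python
import random


def is_cross_layer_cut(failed, logical_nodes, routing):
    """True if failing the physical links in `failed` disconnects the logical topology."""
    adjacency = {v: set() for v in logical_nodes}
    for (u, v), route in routing.items():
        if failed.isdisjoint(route):  # the logical link survives
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(logical_nodes))
    seen, stack = {start}, [start]
    while stack:  # depth-first search over surviving logical links
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) < len(logical_nodes)


def sample_cut_counts(m, logical_nodes, routing, p_prime, T, rng):
    """Steps 2 and 3: draw T states at failure probability p_prime and
    return counts, where counts[i] is the number of sampled cuts of size i."""
    counts = [0] * (m + 1)
    for _ in range(T):
        failed = {e for e in range(m) if rng.random() < p_prime}
        if is_cross_layer_cut(failed, logical_nodes, routing):
            counts[len(failed)] += 1
    return counts


def estimate_unreliability(counts, T, p, p_prime):
    """Step 4: average of the likelihood ratio estimator over the T samples."""
    m = len(counts) - 1
    total = 0.0
    for i, M_i in enumerate(counts):
        if M_i:
            ratio = (p**i * (1 - p)**(m - i)) / (p_prime**i * (1 - p_prime)**(m - i))
            total += M_i * ratio
    return total / T


# Toy instance (assumed): a logical triangle routed one-to-one over 3 fibers.
logical_nodes = {"a", "b", "c"}
routing = {("a", "b"): {0}, ("b", "c"): {1}, ("a", "c"): {2}}
m, d = 3, 2                      # d is the MCLC value of this toy routing
p_prime = d / m
T = 20000
counts = sample_cut_counts(m, logical_nodes, routing, p_prime, T, random.Random(0))
estimate = estimate_unreliability(counts, T, p=0.1, p_prime=p_prime)
```

Because the counts are stored by cut size, the same `counts` can be reused in `estimate_unreliability` for other values of `p` without resampling, which mirrors step 4.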


By setting $p'$ to $\frac{d}{m}$, the algorithm IS maximizes the likelihood of sampling
network states with exactly $d$ failed fibers, thereby achieving the best estimate of the
number of small cross-layer cuts, which contribute the majority of the unreliability
when $p$ is small.

Compared to this importance sampling approach, the algorithm introduced in Section 3.4
requires a total of $O(m^d \log m)$ samples to estimate all values of $N_i$. However,
the output of the algorithm allows us to estimate the cross-layer reliability accurately
for all values of $p$. Note that the majority of the computation is allocated to
estimating the values of $N_i$ where $i$ is close to $d$. In particular, similar to
importance sampling, the algorithm requires $O(m^d)$ samples to compute the value of
$N_d$, by enumerating all $O(m^d)$ possible network states with $d$ fibers. In this
regard, both algorithms require a similar amount of computation to obtain a good
estimate of $N_d$, in order to accurately estimate the cross-layer reliability when $p$
is small.

In IS, since the value of $p'$ is chosen to optimize for small $p$, the relative error
of the reliability estimate for large $p$ can be large if the same set of samples is
used. For instance, when p =  the variance of the estimator $\mathrm{Var}_{p'}(U')$ is
given by:

$$
\mathrm{Var}_{p'}(U') = E_p(U') - \big(E_p(U)\big)^2, \quad \text{by Equation (3.9)}
$$
In other words, if the samples are collected with link failure probability p ~ ,
the algorithm IS will require at least 0((^)m 3) samples in order to approximate the
cross-layer reliability at p =  to a constant relative error. Therefore, to estimate the
cross-layer reliability accurately and efficiently for all values of $p$, the algorithm
IS needs to be extended to collect samples at various link failure probabilities $p'$.
In that case, the sampling plan will become quite similar to the algorithm in
Section 3.4, which explicitly controls the collection of network states with different
sizes by sampling network states of each size separately.

Chapter 4


Optimal  Reliability  Conditions  for 
Lightpath  Routings 

4.1  Introduction 

In the previous chapter, we defined the cross-layer reliability to quantify network
survivability under random physical failures, and developed an algorithm to estimate
the cross-layer reliability function. This allows us to assess the reliability of a
layered network under different link failure probabilities. One important observation we
made is that a lightpath routing that is good at one failure probability may not perform
as well as other lightpath routings under a different failure probability. As such,
optimal lightpath routings under different failure probabilities may have different
characteristics.

The goal of this chapter is to study the relationship between the link failure
probability, the cross-layer reliability and the structure of a layered network.
Understanding this relationship will shed light on desirable properties for a reliable
layered network in different failure probability regimes. The key to our study is the
cross-layer failure polynomial introduced in Chapter 3. The coefficients of the
polynomial contain the structural information about the cross-layer topology and
lightpath routing. The study of the polynomial allows us to formulate the optimality
condition and provides important insights on how lightpath routing should be designed
for better reliability, which will be the focus of Chapter 5.

This chapter is organized as follows. In Section 4.2, we discuss previous work on
designing reliable single-layer networks under random link failures, and discuss the
applicability of these results to our multi-layer models. We review our network and
failure model in Section 4.3, and discuss some concepts that are important to our study
in the following sections. In Section 4.4, we identify the conditions for optimal
lightpath routings in different failure probability regimes. Namely, in the low
probability regime, maximizing the size of the min cut of the (layered) network
maximizes reliability, whereas in the high probability regime, minimizing the size of
the smallest spanning tree of the (layered) network maximizes reliability. The results
from Section 4.4 are extended in Section 4.5, where additional information about the
layered network is taken into account in the analysis, leading to a stronger result that
unifies the results in the previous sections. Finally, in Section 4.6, we carry out
empirical studies to examine various attributes of lightpath routings optimized for the
different failure probability regimes, and compare the bounds developed in Section 4.5
with the actual values.


4.2  Related  Work 

The problem of designing reliable networks has been studied rather extensively in the
single-layer setting. In the single-layer network design problem, the goal is to
construct the most reliable graph topology, given the number of nodes and the number of
edges. An important concept here is that of a uniformly optimally reliable (UOR) graph:
a graph is uniformly optimally reliable if, for all values of the link failure
probability, it yields the best reliability among the graphs with the same numbers of
nodes and edges. The work in [21, 116] studied the conditions for a UOR graph to exist.
However, a UOR graph does not always exist [79], and hence, it is also important to
study locally optimally reliable (LOR) graphs. In [16], the authors characterized the
class of LOR graphs for different failure probability regimes. More details on the
classes of UOR graphs and LOR graphs can be found in [9, 10, 20, 80].

The reliable network design problem in a layered setting consists of three components:
logical topology design, physical topology design, and lightpath routing design. In
layered networks, careful design of the physical and logical topologies alone does not
immediately translate to high reliability, as the lightpath routing also plays a crucial
role. In this chapter, we focus on reliable lightpath routing design assuming that the
logical and physical topologies are given. As we will see in the following sections,
some of the important insights behind reliable topology design in the single-layer
setting can be adapted to our lightpath routing design problem.


4.3  Failure  Polynomial  and  Connectivity  Parameters 

We consider the same network and failure model as in Chapter 3, where a layered network
consists of the logical topology $G_L = (V_L, E_L)$ built on top of the physical
topology $G_P = (V_P, E_P)$ through a lightpath routing. The number of physical links
$|E_P|$ is denoted by $m$, and each physical link fails independently with probability
$p$. When a physical link fails, all logical links that use the physical link also fail.
The reliability of the layered network is defined to be the probability that the logical
topology remains connected.

Recall that the unreliability of the lightpath routing, that is, the probability that
the logical network becomes disconnected, can be expressed as the failure polynomial
(Section 3.3.1):

$$
F(p) = \sum_{i=0}^{m} N_i\, p^i (1-p)^{m-i}. \tag{4.1}
$$

Each coefficient $N_i$ represents the number of cross-layer cuts of size $i$ in the
network. Define a Min Cross Layer Cut (MCLC) as a smallest set of physical links needed
to disconnect the logical network. Denote by $d$ the size of the MCLC; then $d$ is the
smallest $i$ such that $N_i > 0$, meaning that the logical network will not be
disconnected by fewer than $d$ physical link failures. As discussed in Chapter 2, the
MCLC is a generalization of the single-layer min-cut to the multi-layer setting.
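For a small network, the coefficients $N_i$ and the MCLC size $d$ can be obtained by brute-force enumeration of all $2^m$ failure states. The sketch below is illustrative only: the data layout, the function names and the toy routing (a logical triangle in which two lightpaths share fiber 3) are assumptions, and the enumeration is exponential in $m$.

```python
from itertools import combinations


def is_cross_layer_cut(failed, logical_nodes, routing):
    """True if failing the physical links in `failed` disconnects the logical topology."""
    adjacency = {v: set() for v in logical_nodes}
    for (u, v), route in routing.items():
        if failed.isdisjoint(route):  # the logical link survives
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(logical_nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) < len(logical_nodes)


def cut_vector(m, logical_nodes, routing):
    """N[i] = number of cross-layer cuts consisting of exactly i physical links."""
    return [
        sum(is_cross_layer_cut(set(state), logical_nodes, routing)
            for state in combinations(range(m), i))
        for i in range(m + 1)
    ]


def failure_polynomial(N, p):
    """F(p) from Equation (4.1)."""
    m = len(N) - 1
    return sum(N_i * p**i * (1 - p)**(m - i) for i, N_i in enumerate(N))


def mclc_size(N):
    """d = smallest i with N_i > 0."""
    return next(i for i, N_i in enumerate(N) if N_i > 0)


# Hypothetical layered network with m = 4 fibers; lightpaths (b,c) and (a,c)
# both traverse fiber 3, so a single failure of fiber 3 isolates node c.
logical_nodes = {"a", "b", "c"}
routing = {("a", "b"): {0}, ("b", "c"): {1, 3}, ("a", "c"): {2, 3}}
N = cut_vector(4, logical_nodes, routing)
d = mclc_size(N)
```

For this toy routing the shared fiber yields a cross-layer cut of size one, even though the logical triangle on its own is 2-connected.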

Define a Max Cross Layer Non-Cut (MCLNC) as a largest set of physical links whose
failure would not disconnect the logical network. Denote by $c$ the size of the MCLNC;
then $c$ is the maximum number of fiber failures that the logical network can possibly
survive. Since $N_i \le \binom{m}{i}$ by definition, $c$ is the largest $i$ such that
$N_i < \binom{m}{i}$, and we have $N_i = \binom{m}{i}$ for all $i > c$, meaning that
more than $c$ failures would always disconnect the logical network.

The Cross Layer Non-Cuts are closely related to the Cross-Layer Spanning Trees, defined
in Section 2.2.3 as a minimal set of fibers whose survival keeps the logical network
connected. Hence, if $T$ is a cross-layer spanning tree, then the survival of just
$T \setminus \{(i, j)\}$ renders the logical network disconnected for any fiber
$(i, j) \in T$. Note that this is a generalization of the single-layer spanning tree.
However, unlike a single-layer graph where all spanning trees have the same size, in a
layered graph, spanning trees can have different sizes. Thus, we define a Min Cross
Layer Spanning Tree (MCLST) as a cross-layer spanning tree with the minimum number of
physical links.

Each Max Cross Layer Non-Cut corresponds to a Min Cross Layer Spanning Tree, and vice
versa. That is, for an MCLNC $S$, $E_P \setminus S$ is an MCLST because the survival of
$E_P \setminus S$ keeps the logical network connected, yet the removal of any additional
link would disconnect the network. Consequently, the value $b = m - c$ is the size of
the Min Cross Layer Spanning Tree (MCLST), and any result about the MCLNC directly
translates into a result about the MCLST, and vice versa. In the following, we will use
both terms interchangeably.
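Given a cut vector, the three connectivity parameters can be read off directly from the definitions above. The following is a minimal sketch; the function name and the two example cut vectors are hypothetical and chosen only for illustration.

```python
from math import comb


def connectivity_parameters(N):
    """Return (d, c, b) for a cut vector N = [N_0, ..., N_m].

    d: MCLC size, the smallest i with N_i > 0.
    c: MCLNC size, the largest i with N_i < C(m, i).
    b: MCLST size, b = m - c.
    """
    m = len(N) - 1
    d = next(i for i, N_i in enumerate(N) if N_i > 0)
    c = max(i for i, N_i in enumerate(N) if N_i < comb(m, i))
    return d, c, m - c


# Hypothetical cut vectors for m = 4 and m = 3 physical links.
params_shared = connectivity_parameters([0, 1, 6, 4, 1])
params_disjoint = connectivity_parameters([0, 0, 3, 1])
```

In the first vector every set of two or more links is a cut, so at most one failure can be survived (c = 1) and a smallest cross-layer spanning tree needs b = 3 fibers.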

Note that for given logical and physical topologies, the MCLC and MCLST are both
determined by the lightpath routing. Consider again the examples in Figure 3-1. The
disjoint routing in Figure 3-1(c), which has better reliability for small $p$, has
$d = 2$ and $b = 3$. On the other hand, the shortest routing in Figure 3-1(d), which has
better reliability for large $p$, has $d = 1$ and $b = 2$. Furthermore, the optimal
routing in Figure 3-1(e) has $d = 2$ and $b = 2$. This example suggests that maximizing
the MCLC may lead to better reliability for small $p$, while minimizing the MCLST may
lead to better reliability for large $p$. It turns out that this is true in general, and
this will be further discussed in Section 4.4.
4.4  Properties  of  Optimal  Lightpath  Routings


Based on the failure polynomial of a lightpath routing, and its associated connectivity
parameters, one can develop insights into optimal lightpath routing under different
probability regimes. In Section 3.8 we mentioned that a lightpath routing with a higher
MCLC value will have higher reliability for sufficiently small link failure probability
$p$. In this section, we discuss in greater detail the optimal lightpath routings in
different failure probability regimes.

4.4.1  Uniformly  and  Locally  Optimal  Lightpath  Routings 

We  start  with  a  discussion  of  routings  that  are  most  reliable  for  all  failure  probabilities. 
The  observations  in  this  section  will  motivate  a  local  (in  p)  optimization  approach  to 
the  design  of  lightpath  routing,  which  is  relatively  easy  compared  with  an  optimization 
over  all  the  values  of  p.  We  begin  with  the  following  definition: 

Definition  4.1  For  given  logical  and  physical  topologies,  a  lightpath  routing  is  said 
to  be  uniformly  optimal  if  its  reliability  is  greater  than  or  equal  to  that  of  any  other 
lightpath  routing  for  every  value  of  p. 

Therefore, a uniformly optimal lightpath routing yields the best reliability for any
value of $p \in [0, 1]$. Based on the failure polynomial of a lightpath routing, one can
immediately develop a sufficient condition for a uniformly optimal lightpath routing:

Theorem 4.1 Given a lightpath routing $R$, let $N_i^R$ be the number of cross-layer cuts
with size $i$. Then $R$ is a uniformly optimal lightpath routing if, for any other
lightpath routing $R'$, $N_i^R \le N_i^{R'}$ for all $i \in \{0, \ldots, m\}$, where $m$
is the number of physical links.


Proof. The unreliabilities of the lightpath routings $R$ and $R'$ are given by:

$$
\sum_{i=0}^{m} N_i^{R}\, p^i (1-p)^{m-i}
\quad \text{and} \quad
\sum_{i=0}^{m} N_i^{R'}\, p^i (1-p)^{m-i},
$$

respectively. It follows that:

$$
\sum_{i=0}^{m} N_i^{R}\, p^i (1-p)^{m-i} - \sum_{i=0}^{m} N_i^{R'}\, p^i (1-p)^{m-i}
  = \sum_{i=0}^{m} \big(N_i^{R} - N_i^{R'}\big)\, p^i (1-p)^{m-i}
  \le 0,
$$

for any $p \in [0, 1]$, which implies that the reliability of $R$ is never less than
that of any other lightpath routing. □
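The sufficient condition of Theorem 4.1 is a componentwise comparison of cut vectors, which is easy to check once the vectors are available. The sketch below is for illustration only; the function names and the two cut vectors are hypothetical.

```python
def failure_polynomial(N, p):
    """F(p) = sum_i N_i p^i (1-p)^(m-i) for a cut vector N."""
    m = len(N) - 1
    return sum(N_i * p**i * (1 - p)**(m - i) for i, N_i in enumerate(N))


def dominates(N_R, N_other):
    """True if N_R[i] <= N_other[i] for every cut size i."""
    return len(N_R) == len(N_other) and all(a <= b for a, b in zip(N_R, N_other))


def satisfies_uniform_condition(N_R, other_cut_vectors):
    """Sufficient condition of Theorem 4.1 against a given list of alternatives."""
    return all(dominates(N_R, N_other) for N_other in other_cut_vectors)


# Hypothetical cut vectors of two routings over the same m = 3 physical links.
N_R = [0, 0, 3, 1]
N_alt = [0, 1, 3, 1]
condition_holds = satisfies_uniform_condition(N_R, [N_alt])

# Numerical spot check of the conclusion on a grid of failure probabilities.
grid = [k / 100 for k in range(101)]
never_worse = all(
    failure_polynomial(N_R, p) <= failure_polynomial(N_alt, p) + 1e-15 for p in grid
)
```

Note that the theorem quantifies over every other lightpath routing, so in practice the list of alternatives would have to cover all routings for the given topologies; the check above only compares against the vectors supplied.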

The  existence  of  a  uniformly  optimal  lightpath  routing  depends  on  the  logical 
and  physical  topologies.  For  example,  the  lightpath  routing  shown  in  Figure  3-1  (e) 
is  uniformly  optimal  for  the  topologies  in  Figure  3-1.  In  contrast,  Figure  4-1  shows 
two  different  lightpath  routings  that  are  optimal  when  p  is  sufficiently  small  and 
sufficiently  large,  respectively.  In  this  case,  there  is  no  single  lightpath  routing  which 
yields  the  highest  reliability  regardless  of  the  link  failure  probability.  Also  note  that 
in  Figure  4-1  (a),  all  the  logical  links  are  routed  with  physically  disjoint  paths  that 
also  happen  to  be  physically  shortest  paths.  Therefore,  a  lightpath  routing  that  uses 
both  physically  shortest  and  disjoint  paths  does  not  guarantee  uniform  optimality  in 
general.  However,  we  conjecture  that  the  following  special  class  of  single-hop  lightpath 
routing  is  uniformly  optimal: 

Conjecture 1 Given a physical topology $G_P = (V_P, E_P)$, and logical topology
$G_L = (V_L, E_L)$ where $E_L \subseteq E_P$, the single-hop lightpath routing, where
each logical link $(s, t)$ takes on the physical fiber $(s, t)$ as its physical route,
is uniformly optimal.

Since uniformly optimal lightpath routings are not always attainable, this motivates us
to focus on non-uniformly (or locally) optimal routings, where the probability regime of
optimality is restricted to a subrange within $[0, 1]$. A locally optimal lightpath
routing is defined as follows:

Definition 4.2 For given logical and physical topologies, a lightpath routing is said
to be locally optimal if there exist $0 \le a < b \le 1$, such that its reliability is
greater than or equal to that of any other lightpath routing for every value of
$p \in [a, b]$. In addition, the interval $[a, b]$ is called the optimality regime for
the lightpath routing.

[Figure 4-1 panels: (a) routing labeled LOW; (b) Optimal Routing in High Regime (HIGH);
(c) Unreliability of the two Lightpath Routings, plotted against Link Failure
Probability (p).]

Figure 4-1: Example showing that a uniformly optimal routing does not always exist.
Physical topology is in solid line, logical topology is the triangle formed by the 3
corner nodes and 3 edges, and lightpath routing is in dashed line.

Note that a uniformly optimal lightpath routing is also locally optimal with optimality
regime $[0, 1]$. Theorem 4.2 below is a crucial result for this study; namely, it
reveals a connection between local optimality and uniform optimality.

Theorem 4.2 Consider a pair of logical and physical topologies $(G_L, G_P)$ for which
there exists a uniformly optimal lightpath routing. Then, any locally optimal lightpath
routing for $(G_L, G_P)$ is also uniformly optimal.


Proof. Denote by $F^*(p)$ the failure polynomial of a uniformly optimal lightpath
routing. By definition, $F^*(p)$ is no greater than any other failure polynomial for
$p \in [0, 1]$. Consider a locally optimal lightpath routing $L$, and let $F^L(p)$ be
its failure polynomial. Let $[p_1, p_2]$ be the interval over which the routing $L$ is
optimal.

The polynomial equation $F^L(p) - F^*(p) = 0$ has degree at most $m$ and thus has at
most $m$ roots unless the polynomial $F^L(p) - F^*(p)$ is identically zero. However, by
the definitions of local optimality and uniform optimality, the equation has an infinite
number of solutions over the interval $[p_1, p_2]$. Consequently, $F^L(p)$ is identical
to $F^*(p)$, which implies that lightpath routing $L$ is also uniformly optimal. □

Motivated by this result, we study locally optimal lightpath routings. In particular,
we develop the conditions for a lightpath routing to be optimal for both the low
failure probability regime (small $p$) and high failure probability regime (large $p$).

4.4.2  Low  Failure  Probability  Regime 

It is easy to see that in the failure polynomial, the terms corresponding to small
cross-layer cuts dominate when $p$ is small. Hence, for reliability maximization in the
low failure probability regime, it is desirable to minimize the number of small
cross-layer cuts. We use this intuition to derive the properties of optimal routings
for small $p$. We begin with the following definition:

Definition 4.3 Consider two lightpath routings 1 and 2. Routing 1 is said to be more
reliable than routing 2 in the low failure probability regime if there exists a positive
number $p_0$ such that the reliability of routing 1 is higher than that of routing 2 for
$0 < p < p_0$. A lightpath routing is said to be locally optimal in the low failure
probability regime if it is more (or equally) reliable than any other routing in the low
failure probability regime.

Let $d_j$ be the size of the MCLC under routing $j$ (= 1, 2). Let $N_i$ and $M_i$ be
the numbers of cross-layer cuts of size $i$ under routings 1 and 2 respectively. We call
the vector $N = [N_i \mid \forall i]$ the cut vector. The following is an example of cut
vectors $N$ and $M$ with $d_1 = 4$ and $d_2 = 3$:

    i      0    1    2    3    4    5   ...   m
    N_i    0    0    0    0   20   26   ...   1
    M_i    0    0    0    9   19   30   ...   1

Using  cut  vectors  of  lightpath  routings,  we  define  lexicographical  ordering  as  follows: 

Definition  4.4  Routing  1  is  lexicographically  smaller  than  routing  2  if  Nc  <  Mc 
where  c  is  the  smallest  i  at  which  Ni  and  Mi  differ. 

In  the  above  example,  we  have  c  =  3  and  Nc  <  Mc,  hence  routing  1  is  lexicographically 
smaller.  Therefore,  if  a  lightpath  routing  is  lexicographically  smaller  than  another,  it 
has  fewer  small  cross-layer  cuts  and  thus  yields  better  reliability  for  small  p. 
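The lexicographical comparison of Definition 4.4 can be written directly from the definition. This is a small sketch for illustration; the function names are assumptions, and the two lists contain only the entries for $i = 0, \ldots, 5$ listed in the example above, since the remaining entries are elided there.

```python
def first_difference(N, M):
    """Smallest index i at which N_i and M_i differ, or None if there is none."""
    for i, (n_i, m_i) in enumerate(zip(N, M)):
        if n_i != m_i:
            return i
    return None


def lex_smaller(N, M):
    """True if cut vector N is lexicographically smaller than M (Definition 4.4)."""
    c = first_difference(N, M)
    return c is not None and N[c] < M[c]


# Entries i = 0..5 of the example cut vectors (later entries are not given).
N_prefix = [0, 0, 0, 0, 20, 26]
M_prefix = [0, 0, 0, 9, 19, 30]

c = first_difference(N_prefix, M_prefix)
routing_1_smaller = lex_smaller(N_prefix, M_prefix)
```

Python's built-in list comparison is also lexicographic, so `N_prefix < M_prefix` gives the same answer for vectors of equal length; the explicit version is shown because it also returns the index `c` used in the theorems that follow.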

Theorem 4.3 Given two lightpath routings 1 and 2 with cut vectors
$[N_i \mid i = 0, \ldots, m]$ and $[M_i \mid i = 0, \ldots, m]$ respectively, where $m$
is the number of physical links, if routing 1 is lexicographically smaller than routing
2, then routing 1 is more reliable than routing 2 in the low failure probability regime.
In particular, let $c = \min_{0 \le i \le m} \{ i : M_i \ne N_i \}$ be the index where
the elements in the cut vectors first differ. There exists p_0 >
such that lightpath routing 1 is more reliable than routing 2 for $p < p_0$.

Proof.  This  is  implied  by  Theorem  4.11,  which  will  be  proved  in  Section  4.5.  □ 

Clearly, Theorem 4.3 leads to a local optimality condition; that is, if a lightpath
routing minimizes the cut vector lexicographically, then it is locally optimal in the
low failure probability regime. An interesting case is when routing 1 has a larger MCLC
than routing 2 (as in the above example). In this case, routing 1 is lexicographically
smaller than routing 2, so Theorem 4.3 implies Theorem 3.11, which we restate here as a
corollary:

Corollary 4.4 If $d_1 > d_2$, then routing 1 is more reliable than routing 2 in the low
failure probability regime.

Consequently, a lightpath routing with the maximum size MCLC yields the best
reliability for small $p$. Similarly, routing 1 is also lexicographically smaller than
routing 2 when they have the same size of MCLC but routing 1 has fewer MCLCs. This
leads to the following result:

Corollary 4.5 If $d_1 = d_2$ and $N_{d_1} < M_{d_2}$, then routing 1 is more reliable
than routing 2 in the low failure probability regime.

The  expression  for  p0  given  in  Theorem  4.3  also  provides  some  insight  into  how 
the  difference  of  the  cut  vectors  affects  the  guaranteed  regime.  For  example,  if  c  is 
small  and  Mc  —  Nc  is  large,  the  guaranteed  regime  is  larger.  In  other  words,  if  one 
lightpath  routing  has  fewer  small  cross-layer  cuts  than  the  other,  it  will  achieve  higher 
reliability  for  a  larger  range  of  p  in  the  low  probability  regime. 

Therefore, for reliability maximization in the low failure probability regime, it is
desirable to maximize the size of the MCLC while minimizing the number of such MCLCs.
This condition will be used to develop lightpath routing algorithms in Chapter 5.

Finally, Theorem 4.3 also implies that all lightpath routings that are locally optimal
in the low failure probability regime have the same failure polynomial. In other words,
from the reliability standpoint, all locally optimal lightpath routings in the low
failure probability regime are equivalent.


Corollary 4.6 Let A and B be two different locally optimal lightpath routings in the
low failure probability regime. Then the reliabilities of the two lightpath routings
are identical for all link failure probabilities $p$.

Proof. We show that the failure polynomials of the two lightpath routings are
identical. Suppose the failure polynomials are different. Then one of the lightpath
routings is lexicographically smaller than the other. Therefore, one of them cannot be
locally optimal in the low failure probability regime. □
4.4.3  High  Failure  Probability  Regime

We have seen that when $p$ is small, it is important to minimize the number of small
cuts. Analogously, for large $p$, large cuts are dominant, and hence, minimizing the
number of large cuts would result in maximum reliability. In other words, the cut
vector should be minimized for large cuts for better reliability in the high failure
probability regime. Similar to the case of the low probability regime, we define the
following:

Definition 4.5 Consider two lightpath routings 1 and 2. Routing 1 is said to be more
reliable than routing 2 in the high failure probability regime if there exists a number
$p_0 < 1$ such that the reliability of routing 1 is higher than that of routing 2 for
$p_0 < p$.

An important parameter in this case is the Max Cross Layer Non-Cut (MCLNC), because
logical networks with large MCLNC may remain connected even if only a small number of
physical links survive. For the high failure probability regime, the colexicographical
ordering of the lightpath routings can be used to compare reliability performance. A cut
vector $[N_i \mid i = 0, \ldots, m]$ is colexicographically smaller than another cut
vector $[M_i \mid i = 0, \ldots, m]$ if and only if the vector
$[N_{m-i} \mid i = 0, \ldots, m]$ is lexicographically smaller than
$[M_{m-i} \mid i = 0, \ldots, m]$. In other words, rather than being based on the first
element at which the vectors differ, the colexicographical ordering is based on the last
element at which the vectors differ. Therefore, if a lightpath routing has a larger
MCLNC, it is also colexicographically smaller. The following theorem is a similar result
to Theorem 4.3.
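The colexicographical ordering amounts to comparing the reversed cut vectors. The sketch below illustrates this with two hypothetical cut vectors for $m = 3$ physical links, chosen so that the two orderings disagree; the function names and the vectors are assumptions made for illustration.

```python
def colex_smaller(N, M):
    """True if cut vector N is colexicographically smaller than M,
    i.e. N is smaller at the last index where the two vectors differ."""
    for n_i, m_i in zip(reversed(N), reversed(M)):
        if n_i != m_i:
            return n_i < m_i
    return False


def failure_polynomial(N, p):
    """F(p) = sum_i N_i p^i (1-p)^(m-i) for a cut vector N."""
    m = len(N) - 1
    return sum(N_i * p**i * (1 - p)**(m - i) for i, N_i in enumerate(N))


# Hypothetical cut vectors: routing 1 has a cut of size one but fewer cuts of size two.
N = [0, 1, 2, 1]   # routing 1
M = [0, 0, 3, 1]   # routing 2

routing_1_colex_smaller = colex_smaller(N, M)
routing_2_lex_smaller = M < N            # built-in list comparison is lexicographic

# Unreliability of each routing at a high and a low failure probability.
high = (failure_polynomial(N, 0.9), failure_polynomial(M, 0.9))
low = (failure_polynomial(N, 0.1), failure_polynomial(M, 0.1))
```

For these vectors the difference of the two failure polynomials is $p(1-p)(1-2p)$, so routing 1 has the lower unreliability for $p > 1/2$ and routing 2 has the lower unreliability for $p < 1/2$, matching the two orderings.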


Theorem 4.7 Given two lightpath routings 1 and 2 with cut vectors
$[N_i \mid i = 0, \ldots, m]$ and $[M_i \mid i = 0, \ldots, m]$ respectively, where $m$
is the number of physical links, if routing 1 is colexicographically smaller than
routing 2, then routing 1 is more reliable than routing 2 in the high failure
probability regime. In particular, let $c = \max_{0 \le i \le m} \{ i : M_i \ne N_i \}$
be the index where the elements in the cut vectors last differ. There exists p_0 <
1 _ (m-c+i)(A/c-A'c) such that lightpath routing 1 is more reliable than routing 2 for
$p > p_0$.
Proof. This is implied by Theorem 4.13, which will be proved in Section 4.5. □

Let $c_j$ be the size of the MCLNC for routing $j$ (= 1, 2). We can develop the
following corollaries similar to the low regime case:

Corollary 4.8 If $c_1 > c_2$, then routing 1 is more reliable than routing 2 in the
high failure probability regime.

Corollary 4.9 If $c_1 = c_2$ and $N_{c_1} < M_{c_2}$, then routing 1 is more reliable
than routing 2 in the high failure probability regime.

Corollary 4.10 Let A and B be two different locally optimal lightpath routings in the
high failure probability regime. Then the reliabilities of the two lightpath routings
are identical for all link failure probabilities $p$.

Therefore, for reliability maximization in the high failure probability regime, it is
desirable to find a lightpath routing that maximizes the size of the MCLNC (or
equivalently, minimizes the size of the MCLST) and minimizes the number of MCLNCs (or
maximizes the number of MCLSTs). This observation is similar to the single-layer setting
where maximizing the number of spanning trees maximizes the reliability for large $p$
[16]. The major difference in the multi-layer case is that, since spanning trees may
have different sizes, minimizing the size of the Min Cross-Layer Spanning Tree becomes
the primary objective. As shown in Section 2.2.3, computing the size of the MCLST is
NP-hard. Therefore, designing a lightpath routing that minimizes the MCLST is likely to
be a difficult problem. In Appendix 4.8.1, we present an ILP that formulates the
survivable lightpath routing problem with an objective to minimize the MCLST.

4.5  Extension  of  Probability  Regimes 

In the previous sections we have shown that a lightpath routing with a cut vector
that is lexicographically (or colexicographically) smaller will have a higher
reliability when the link failure probability is sufficiently small (or high).
However, the guaranteed regimes established in Theorems 4.3 and 4.7 are usually
rather conservative, since the expressions only consider the first element in the
two cut vectors that differs. For instance, the expression fails to capture the
uniform optimality of a lightpath routing that satisfies the condition in
Theorem 4.1. In this section, we will develop a more general expression for the
regime bounds that includes other elements in the cut vectors.

Consider two lightpath routings 1 and 2. Let $F_j(p)$ be the failure polynomial of
routing $j$ $(= 1, 2)$, and let the $N_i$'s and $M_i$'s be the coefficients in
$F_1(p)$ and $F_2(p)$ respectively. Define the following two vectors of partial sums:

$$\overrightarrow{N} = \left[ \sum_{i=0}^{k} N_i \;\middle|\; k = 0, \ldots, m \right] \quad \text{and} \quad \overleftarrow{N} = \left[ \sum_{i=k}^{m} N_i \;\middle|\; k = 0, \ldots, m \right].$$

The vectors $\overrightarrow{M}$ and $\overleftarrow{M}$ are defined similarly. Note
that the $i$-th element $\overrightarrow{N}_i$ of vector $\overrightarrow{N}$ is the
total number of cross-layer cuts of size at most $i$. Likewise,
$\overleftarrow{N}_i$ is the total number of cross-layer cuts of size at least $i$.
We will use these vectors to develop conditions that incrementally include larger
cuts and thus extend the probability regime where one lightpath routing is more
reliable than the other. We first extend the definition of lexicographical ordering
as follows:

Definition 4.6 Lightpath routing 1 is said to be k-lexicographically smaller than
lightpath routing 2 if

$$k = \max \left\{ j : \overrightarrow{N}_i \le \overrightarrow{M}_i, \ \forall i < d + j \right\} \quad \text{and} \quad k \ge 1,$$

where d is the position of the first element where the two cut vectors differ.

Therefore, a lightpath routing is lexicographically smaller (in the original sense)
if and only if it is k-lexicographically smaller for some k ≥ 1. The
k-lexicographical ordering thus compares two lightpath routings based on structures
beyond the smallest cuts, making it possible to establish a larger optimality
regime. Roughly speaking, the value of k reflects the degree of dominance of a
lightpath routing in the low probability regime: a k-lexicographically smaller
lightpath routing has fewer "small" cuts, where the definition of "small" is broader
if k is larger.
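
As a concrete illustration of Definition 4.6, the following Python sketch computes the prefix partial sums of two cut vectors and the resulting value of k. The function name `k_lex_degree` and the example cut vectors are hypothetical, chosen only for illustration, and returning 0 when routing 1 is not lexicographically smaller is a convention assumed here rather than part of the definition.

```python
from itertools import accumulate


def k_lex_degree(N, M):
    """Return k if cut vector N is k-lexicographically smaller than M, else 0.

    N[i] and M[i] are the numbers of cross-layer cuts of size i (i = 0..m).
    """
    if len(N) != len(M):
        raise ValueError("cut vectors must have the same length")
    differing = [i for i, (a, b) in enumerate(zip(N, M)) if a != b]
    if not differing:
        return 0  # identical cut vectors: neither routing is smaller
    d = differing[0]  # first position where the cut vectors differ
    prefix_n = list(accumulate(N))  # cuts of size at most i in routing 1
    prefix_m = list(accumulate(M))  # cuts of size at most i in routing 2
    k = 0
    # Count how far past d the prefix sums of N stay below those of M.
    for i in range(d, len(N)):
        if prefix_n[i] > prefix_m[i]:
            break
        k += 1
    return k


# Hypothetical cut vectors for m = 4 physical links.
print(k_lex_degree([0, 0, 1, 4, 1], [0, 0, 2, 2, 1]))  # 1
print(k_lex_degree([0, 0, 1, 4, 1], [0, 0, 2, 3, 1]))  # 3
```

In the second call the prefix sums of the first vector never exceed those of the second, so k reaches its largest possible value m - d + 1 = 3.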

Similarly, for the high failure probability regime, the colexicographical ordering
defined in Section 4.4.3 can be extended to compare cuts beyond only the largest
cuts:

Definition 4.7 Lightpath routing 1 is said to be k-colexicographically smaller than
lightpath routing 2 if

$$k = \max \left\{ j : \overleftarrow{N}_i \le \overleftarrow{M}_i, \ \forall i > c - j \right\} \quad \text{and} \quad k \ge 1,$$

where c is the position of the last element where the two cut vectors differ.

In contrast to the k-lexicographical ordering, this colexicographical ordering starts
from the largest cuts, and incrementally includes the smaller cuts.

It is obvious that when p ≤ 0.5, the failure probability of a cross-layer cut is a
non-increasing function of the cut size, because
$p^i (1-p)^{m-i} \ge p^{i+1} (1-p)^{m-(i+1)}$ for p ≤ 0.5. Suppose that routing 1 has
a smaller total number of cuts of size up to i than routing 2, i.e.,
$\overrightarrow{N}_i \le \overrightarrow{M}_i$. To compare cross-layer cuts of size
at most i + 1, suppose further that the relative increment $N_{i+1} - M_{i+1}$ in the
number of larger cuts does not exceed the surplus
$\overrightarrow{M}_i - \overrightarrow{N}_i$ from smaller cuts, i.e.,
$\overrightarrow{N}_{i+1} \le \overrightarrow{M}_{i+1}$. Then, with respect to cut
size at most i + 1, routing 1 will have a smaller failure probability than routing 2,
provided that the same was true for cut size up to i. This observation leads to the
following theorem on the relationship between lexicographical ordering and
probability regime.


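Theorem 4.11 Given two vectors $N = [N_i \mid i = 0, \ldots, m]$ and
$M = [M_i \mid i = 0, \ldots, m]$. For any $j$, let
$\Delta_j = \sum_{i=0}^{j} (M_i - N_i)$ and
$\delta_j = \max_{j < i \le m} \left\{ \frac{N_i - M_i}{\binom{m}{i}} \right\} \le 1$.
If the vector N is k-lexicographically smaller than M, then:

$$\sum_{i=0}^{m} N_i p^i (1-p)^{m-i} \le \sum_{i=0}^{m} M_i p^i (1-p)^{m-i}$$

for $p \le p_0^l = \min \left\{ 0.5, \max_{d \le j \le d+k-1} B_j \right\}$, where
$d = \min \{ i : N_i < M_i \}$ and:

$$B_j = \begin{cases} 0.5, & \text{if } j = m \\ \dfrac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j}, & \text{otherwise.} \end{cases}$$

Proof. See Appendix 4.8.2. □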
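
To make the bound of Theorem 4.11 concrete, the following Python sketch evaluates $p_0^l$ for two cut vectors, using $B_j = 1 / (m/(j+1) + \delta_j \binom{m}{j+1} / \Delta_j)$ for $j < m$ and $B_m = 0.5$. The function name `low_regime_bound` and the example vectors are hypothetical. Two conventions are assumptions made here for illustration: $B_j$ is taken to be 0.5 when $\delta_j \le 0$ (the full-dominance case argued in the proof in Appendix 4.8.2), and 0 when $\Delta_j = 0$ with $\delta_j > 0$.

```python
from itertools import accumulate
from math import comb


def low_regime_bound(N, M):
    """Return p_0^l of Theorem 4.11, or None if N is not lexicographically smaller."""
    m = len(N) - 1
    differing = [i for i in range(m + 1) if N[i] != M[i]]
    if not differing or N[differing[0]] > M[differing[0]]:
        return None
    d = differing[0]
    # Delta[j] = sum_{i <= j} (M_i - N_i): surplus of small cuts in routing 2.
    Delta = list(accumulate(b - a for a, b in zip(N, M)))
    best = 0.0
    for j in range(d, m + 1):
        if Delta[j] < 0:
            break  # j has left the range d .. d+k-1
        if j == m:
            B = 0.5
        else:
            delta = max((N[i] - M[i]) / comb(m, i) for i in range(j + 1, m + 1))
            if delta <= 0:
                B = 0.5  # assumption: full dominance, as in the appendix proof
            elif Delta[j] == 0:
                B = 0.0  # assumption: the bound degenerates when Delta_j = 0
            else:
                B = 1.0 / (m / (j + 1) + delta * comb(m, j + 1) / Delta[j])
        best = max(best, B)
    return min(0.5, best)


# Hypothetical cut vectors for m = 4 physical links.
print(round(low_regime_bound([0, 0, 1, 4, 1], [0, 0, 2, 2, 1]), 6))  # 0.3
print(low_regime_bound([0, 0, 1, 4, 1], [0, 0, 2, 3, 1]))            # 0.5
```

For the first pair the difference of the failure polynomials is $p^2 (1-p)(1-3p)$, so the true crossing point is 1/3 and the computed bound 0.3 is slightly conservative, as expected.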


Therefore, the probability regime in Theorem 4.11 is a non-decreasing function of
k, which means that a lightpath routing with a smaller number of cuts over a larger
size range is guaranteed to be more reliable over a larger regime. This is
consistent with the conclusion in Section 4.4.2, that the lightpath routing design
should minimize the lexicographical ordering of the cut vector.

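Theorem 4.3 is a direct result of Theorem 4.11. For a lexicographically smaller
lightpath routing, the term $B_d$ in Theorem 4.11 is given by:

$$\begin{aligned}
B_d &= \frac{1}{\frac{m}{d+1} + \delta_d \binom{m}{d+1} / (M_d - N_d)} \\
&\ge \frac{(d+1)(M_d - N_d)}{m (M_d - N_d) + (d+1)\binom{m}{d+1}}, \quad \text{since } \delta_d \le 1 \\
&\ge \frac{(d+1)(M_d - N_d)}{2m\binom{m}{d}}.
\end{aligned}$$

The last step uses $M_d - N_d \le \binom{m}{d}$ and
$(d+1)\binom{m}{d+1} = (m-d)\binom{m}{d} \le m\binom{m}{d}$.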


An interesting special case is when d + k - 1 = m, that is,
$\overrightarrow{M}_j \ge \overrightarrow{N}_j$ for all $j = 0, \ldots, m$. In that
case, the term $B_{d+k-1} = B_m = 0.5$, implying that the optimality regime is
[0, 0.5]. We summarize this as the following corollary:


Corollary 4.12 If $\overrightarrow{N}_j \le \overrightarrow{M}_j$ for all
$j = 0, \ldots, m$, then lightpath routing 1 is at least as reliable as lightpath
routing 2 for p ≤ 0.5, i.e., $F_1(p) \le F_2(p)$ for p ≤ 0.5.


Note that the condition in Corollary 4.12 requires every partial sum in the vector
$\overrightarrow{M}$ to be at least the corresponding partial sum in the vector
$\overrightarrow{N}$, which is a much stronger condition than the lexicographic
comparison in Theorem 4.3. This stronger condition allows the better optimality
regime to be established in Corollary 4.12.

For  the  high  failure  probability  regime,  the  result  is  similar  to  Theorem  4.11: 

Theorem 4.13 Given two vectors $N = [N_i \mid i = 0, \ldots, m]$ and
$M = [M_i \mid i = 0, \ldots, m]$. For any $j$, let
$\Delta_j = \sum_{i=m-j}^{m} (M_i - N_i)$ and
$\delta_j = \max_{0 \le i \le m-j-1} \left\{ \frac{N_i - M_i}{\binom{m}{i}} \right\} \le 1$.
If N is k-colexicographically smaller than M, then:

$$\sum_{i=0}^{m} N_i p^i (1-p)^{m-i} \le \sum_{i=0}^{m} M_i p^i (1-p)^{m-i}$$

for $p \ge p_0^h = \max \left\{ 0.5, \min_{c \le j \le c+k-1} C_j \right\}$, where
$c = \min \{ i : N_{m-i} < M_{m-i} \}$ and:

$$C_j = \begin{cases} 0.5, & \text{if } j = m \\ 1 - \dfrac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j}, & \text{otherwise.} \end{cases}$$

Proof. The proof for Theorem 4.13 is based on Theorem 4.11 and the symmetry
between the k-lexicographical and k-colexicographical orderings. See Appendix 4.8.3
for details. □
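
The symmetry behind this proof can be checked numerically with a short Python sketch: reversing a cut vector and replacing p by 1 - p leaves the value of the failure polynomial unchanged. The helper name `failure_poly` and the example cut vector are hypothetical and serve only as an illustration.

```python
def failure_poly(N, p):
    """Evaluate F(p) = sum_i N_i * p^i * (1 - p)^(m - i) for a cut vector N."""
    m = len(N) - 1
    return sum(n_i * p**i * (1 - p)**(m - i) for i, n_i in enumerate(N))


# Hypothetical cut vector for m = 4 physical links, and its reversal N'_i = N_{m-i}.
N = [0, 0, 1, 4, 1]
N_reversed = N[::-1]

for p in (0.6, 0.75, 0.9):
    # F_N(p) equals F_{N'}(q) with q = 1 - p, so a high-regime statement about N
    # becomes a low-regime statement about N'.
    assert abs(failure_poly(N, p) - failure_poly(N_reversed, 1 - p)) < 1e-12
```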


The  following  corollary  is  analogous  to  Corollary  4.12  for  the  high  failure  regime: 

Corollary 4.14 If $\overleftarrow{N}_j \le \overleftarrow{M}_j$ for all
$j = 0, \ldots, m$, then routing 1 is at least as reliable as routing 2 for p ≥ 0.5,
i.e., $F_1(p) \le F_2(p)$ for p ≥ 0.5.

Finally, combining Corollaries 4.12 and 4.14 gives us a condition for uniformly
optimal lightpath routing:

Corollary 4.15 If $\overrightarrow{N}_j \le \overrightarrow{M}_j$ and
$\overleftarrow{N}_j \le \overleftarrow{M}_j$ for all $j = 0, \ldots, m$, then
lightpath routing 1 is uniformly optimal.
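
The condition of Corollary 4.15 is straightforward to test for a given pair of cut vectors. The sketch below checks that every prefix partial sum and every suffix partial sum of one vector is at most the corresponding sum of the other; the function name `dominates_uniformly` and the example vectors are hypothetical.

```python
from itertools import accumulate


def dominates_uniformly(N, M):
    """True if all prefix and all suffix partial sums of N are <= those of M."""
    prefix_ok = all(a <= b for a, b in zip(accumulate(N), accumulate(M)))
    # Accumulating the reversed vectors yields the suffix sums (largest cuts first).
    suffix_ok = all(
        a <= b for a, b in zip(accumulate(reversed(N)), accumulate(reversed(M)))
    )
    return prefix_ok and suffix_ok


# Hypothetical cut vectors for m = 4 physical links.
print(dominates_uniformly([0, 0, 1, 4, 1], [0, 0, 2, 4, 1]))  # True
print(dominates_uniformly([0, 0, 1, 4, 1], [0, 0, 2, 3, 1]))  # False
```

In the second call the prefix condition holds but the suffix condition fails at size 3, so only the low-regime guarantee of Corollary 4.12 applies.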

Theorems 4.11 and 4.13 unify Theorems 4.1, 4.3, and 4.7 to provide a single
optimality regime expression for lightpath routings that exhibit different degrees of
dominance. Note that the conditions of (co)lexicographical ordering in Corollaries
4.12 and 4.14 are satisfied by the uniform optimality condition
$N_i \le M_i, \ \forall i$ discussed in Theorem 4.1. Therefore, this unified theorem
allows for a broader class of uniformly optimal lightpath routings.


4.6  Empirical  Studies 

In  this  section,  we  conduct  empirical  studies  to  verify  the  results  presented  in  the 
previous  sections.  In  Section  4.6.1,  we  study  two  sets  of  lightpath  routings,  optimized 
for  the  low  and  high  failure  probability  regimes  respectively,  and  compare  their  various 
attributes,  in  order  to  illustrate  the  structural  difference  between  optimal  lightpath 
routings  for  different  regimes.  We  will  also  compare  their  reliability  performance  over 
the link failure probability regime [0, 1]. In Section 4.6.2, we compare the optimality
regimes  among  the  two  sets  of  lightpath  routings,  and  evaluate  the  tightness  of  the 
bounds  given  by  Theorems  4.11  and  4.13. 

All  simulations  in  this  section  are  based  on  the  augmented  NSFNET  (Figure  4-2) 
with  14  nodes  and  29  links  as  the  physical  topology,  and  350  random  logical  topologies 
with  size  ranging  from  6  to  12  nodes  and  connectivity  at  least  4.  For  our  study  of 
lightpath  routings,  we  use  the  ILP-based  rerouting  algorithm  that  we  will  present 
in  Section  5.1  to  generate  a  set  of  lightpath  routings,  called  LPRLow,  that  are  optimized 
for  the  low  regime.  Similarly,  we  use  the  formulation  MCLST  in  Appendix  4.8.1  to 
generate  a  set  of  lightpath  routings,  called  LPRHigh,  optimized  for  the  high  failure 
regime. 


Figure  4-2:  The  augmented  NSFNET. 
4.6.1 Lightpath Routings Optimized for Different Probability
Regimes

We first compare the structures of lightpath routings that are optimized for different
failure probability regimes. Figures 4-3(a), 4-3(b) and 4-3(c) show the average values
of MCLC, MCLST and the number of physical hops in the lightpaths for the two
sets of lightpath routings LPRLow and LPRHigh. For lightpath routings optimized for
the high failure regime, the focus is to minimize the size of the minimum cross-layer
spanning tree (MCLST), so it is not surprising that the size of the MCLST for LPRHigh
is consistently smaller. As a side effect, minimizing the size of the MCLST often leads
to shorter physical paths for the logical links, so the average number of physical hops
for the logical links is consistently smaller for LPRHigh as well. On the other hand,
the key to optimizing reliability for the low failure regime is to maximize the MCLC,
at which the lightpath routings in LPRLow perform better. Overall, there are
noticeable differences in the structures between the two sets, suggesting that the two
objectives can lead to vastly different lightpath routings.

In terms of reliability, this means that uniformly optimal lightpath routings may
not always exist. Figure 4-4 shows the survivability of the two sets of lightpath
routings, in terms of both reliability and unreliability (i.e., 1 - reliability), over
different link failure probabilities. As expected, when the link failure probability
is small, the lightpath routings in LPRLow achieve higher reliability. In particular,
when the link failure probability approaches 0, there is an order of magnitude
difference in terms of unreliability, meaning that maximizing the size of the MCLC
can have a significant impact on the network reliability. As the link failure
probability increases, it becomes more important to minimize the size of the MCLST,
so LPRHigh is able to achieve higher reliability in that regime.

Another interesting observation from the figure is that the difference in reliability
is less prominent in the high failure probability regime. This is partly because the
algorithm used to generate lightpath routings in LPRLow, which tries to maximize
the MCLC as well as minimize the number of MCLCs, is more sophisticated than
the algorithm used to generate the lightpath routings in LPRHigh. In addition, since
the size of an MCLC is usually smaller than the size of an MCLST, the contribution
of an MCLC to the unreliability in the low failure regime is generally greater than
the contribution of an MCLST in the high failure regime. Therefore, the difference in
reliability tends to be greater in the low failure probability regime.

Figure 4-3: Lightpath routings optimized for different probability regimes have
different properties. LPRLow are lightpath routings optimized for MCLC, and LPRHigh
are lightpath routings optimized for MCLST. Panels: (a) Min Cross Layer Cut;
(b) Min Cross Layer Spanning Tree; (c) Average Number of Physical Hops.

Figure 4-4: Reliability (or Unreliability) of lightpath routings optimized for
different probability regimes.

In practical settings, the failure probability of individual physical links is typically
very small. Therefore, our simulation result suggests that minimizing the
lexicographic ordering of the lightpath routings can often lead to meaningful
improvement in network survivability.

4.6.2  Bounds  on  Optimality  Regimes 

Next, we evaluate the bounds on optimality regimes, $p_0^l$ and $p_0^h$, given by
Theorems 4.11 and 4.13. For each pair of physical and logical topologies, we consider
the corresponding lightpath routings in LPRLow and LPRHigh. The values of $p_0^l$
and $p_0^h$ given by the theorems are compared with the actual crossing points of the
failure polynomials, that is, the points where the (co)lexicographically smaller
lightpath routings start to have lower reliability.
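
One simple way to locate such a crossing point numerically is to scan the difference of the two failure polynomials for a sign change and refine it by bisection. The Python sketch below is only an illustration under stated assumptions: the helper names, the grid resolution and the example cut vectors are hypothetical, and the thesis does not specify how its crossing points were computed.

```python
def failure_poly(N, p):
    """Evaluate F(p) = sum_i N_i * p^i * (1 - p)^(m - i) for a cut vector N."""
    m = len(N) - 1
    return sum(n_i * p**i * (1 - p)**(m - i) for i, n_i in enumerate(N))


def first_crossing(N, M, steps=1000, tol=1e-10):
    """Smallest p in (0, 1) where F_N(p) - F_M(p) changes sign, or None."""
    def gap(p):
        return failure_poly(N, p) - failure_poly(M, p)

    prev_p = 1.0 / steps
    prev = gap(prev_p)
    for s in range(2, steps):
        p = s / steps
        cur = gap(p)
        if prev * cur < 0:  # sign change between two grid points
            lo, hi = prev_p, p
            while hi - lo > tol:  # bisection refinement
                mid = (lo + hi) / 2
                if gap(lo) * gap(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            return (lo + hi) / 2
        prev_p, prev = p, cur
    return None


# Hypothetical cut vectors for m = 4; F_M - F_N = p^2 (1 - p)(1 - 3p).
print(round(first_crossing([0, 0, 1, 4, 1], [0, 0, 2, 2, 1]), 4))  # 0.3333
```

A grid scan can miss crossings that occur in pairs between two adjacent grid points, so a finer grid may be needed for polynomials of high degree.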


Figure 4-5: Tightness of optimality regime bound. Each data point corresponds to the
bound given by the theorems and the actual crossing point of the reliability
polynomials of the two lightpath routings. Panels: (a) Low Failure Regime, with axis
label Bound ($p_0^l$); (b) High Failure Regime, with axis label Bound ($p_0^h$).


Each  comparison  corresponds  to  a  data  point  in  Figures  4-5(a)  and  4-5(b),  which 
plot  the  computed  bounds  against  the  actual  crossing  points  for  the  two  failure 
regimes.  Since  the  bounds  given  by  theorems  are  at  most  0.5,  for  illustrative  purpose 
the  actual  crossing  points  are  also  capped  at  0.5. 

In the low failure probability regime, there is a strong correlation between the
value of $p_0^l$ and the actual crossing point, suggesting that the bound provides a
strong signal about the dominance of the lexicographically smaller lightpath routing
in the low failure probability regime.

On the other hand, the correlation between the value of $p_0^h$ and the actual
crossing point is not as prominent in the high failure regime, meaning that the
bounds are not as tight in this case. One possible explanation for this asymmetry is
the difference in effectiveness between the algorithms used to generate the lightpath
routings in LPRLow and LPRHigh. As discussed before, the algorithm used to generate
the lightpath routings in LPRLow is more sophisticated, and is able to generate
solutions that are closer to the optimal. As a result, the lightpath routings in
LPRLow generally exhibit a stronger dominance in the low failure probability regime,
which results in tighter bounds given by Theorem 4.11. On the other hand, the
lightpath routings in LPRHigh are less dominant in the high failure regime, which
results in weaker bounds given by Theorem 4.13. This is confirmed by Figure 4-6,
which shows the distribution of k in the k-(co)lexicographical ordering comparisons.
Excluding the instances with total dominance, about 25% of the lightpath routings in
LPRHigh are only 1-colexicographically smaller than their counterparts. In contrast,
all the lightpath routings in LPRLow are at least 4-lexicographically smaller than
their counterparts, so the bounds are tighter in general.

4.7  Conclusion 

In this chapter, we study the relationship between the link failure probability, the
cross-layer reliability and the structure of a layered network. The key to this study
is the polynomial expression for reliability which relates structural properties of the
network graph and the lightpath routing to the reliability. Using this polynomial, we
show that reliable routings depend on the link failure probability, and identify
optimality conditions for reliability maximization in different failure probability
regimes. In particular, we show that a lightpath routing with the maximum size of
Min Cross Layer Cuts (MCLC) and the minimum number of MCLCs is most reliable in the
low failure probability regime. On the other hand, in the high failure probability
regime, a routing with the minimum size of Min Cross Layer Spanning Tree (MCLST) and
the maximum number of MCLSTs maximizes reliability. This observation provides
useful insights for designing reliable layered networks, which we will focus on in the
next chapter.

Figure 4-6: Histogram of k in k-(co)lexicographical ordering comparisons. Lightpath
routings that dominate in every partial sum are put into the k = 30 bucket.


4.8  Chapter  Appendix 

4.8.1  Lightpath  Routing  ILP  to  Minimize  Minimum  Cross 
Layer  Spanning  Tree  (MCLST)  Size 


As  discussed  in  Section  4.4.3,  lightpath  routings  with  smaller  MCLST  size  will  be 
more  reliable  in  the  high  failure  probability  regime.  In  this  section,  we  present  an  ILP 
for  the  lightpath  routing  formulation  that  minimizes  the  MCLST.  This  ILP  is  used 
in  Section  4.6  to  generate  the  set  of  lightpath  routings,  LPRHigh,  that  are  optimized 
for  the  high  failure  probability  regime.  We  first  define  the  following  variables: 


• $\{ f_{ij}^{st} \mid (s,t) \in E_L, (i,j) \in E_P \}$: Flow variables representing the lightpath routing.

• $\{ y_{ij} \mid (i,j) \in E_P \}$: 1 if fiber $(i,j)$ survives, 0 otherwise.

• $\{ z_{st} \mid (s,t) \in E_L \}$: 1 if lightpath $(s,t)$ survives, 0 otherwise.

• $\{ x_{st} \mid (s,t) \in E_L \}$: Flow variables on the logical topology.


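MCLST:

Minimize $\sum_{(i,j) \in E_P} y_{ij}$

subject to:

$$\sum_{t \in V_L} x_{st} - \sum_{t \in V_L} x_{ts} = \begin{cases} |V_L| - 1, & \text{if } s = 0 \\ -1, & \text{if } s \in V_L - \{0\} \end{cases} \tag{4.2}$$

$$(|V_L| - 1) \cdot z_{st} \ge x_{st}, \quad \forall (s,t) \in E_L \tag{4.3}$$

$$y_{ij} \ge z_{st} + f_{ij}^{st} - 1, \quad \forall (s,t) \in E_L, \ \forall (i,j) \in E_P \tag{4.4}$$

$$\{ (i,j) : f_{ij}^{st} = 1 \} \text{ forms an } (s,t)\text{-path in } G_P, \quad \forall (s,t) \in E_L$$

$$0 \le y_{ij} \le 1; \quad 0 \le x_{st}; \quad z_{st}, f_{ij}^{st} \in \{0, 1\}$$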

The variables $x_{st}$ represent a flow on the logical topology where 1 unit of flow
is sent from logical node 0 to every other logical node, as described by
Constraint (4.2). Constraint (4.3) requires these flows to be carried only on the
surviving logical links, which implies that the surviving links form a connected
logical subgraph. Constraint (4.4) ensures the survival of physical links that are
used by any surviving logical links. Since the objective function minimizes
$\sum_{(i,j) \in E_P} y_{ij}$, the optimal solution will represent a minimum set of
physical links whose survival will allow the logical topology to be connected.

Therefore, the set of physical links $(i,j)$ with $y_{ij} = 1$ forms a cross-layer
spanning tree. As a result, the optimal solution to the above ILP yields a lightpath
routing that minimizes the size of the MCLST.
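
For a fixed lightpath routing on a small instance, the size of the MCLST can be verified by direct enumeration, which is useful for checking ILP solutions. The Python sketch below is a brute-force illustration only: it evaluates a given routing rather than optimizing over routings as the ILP does, it takes time exponential in the number of fibers, and the function names and the toy instance are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Depth-first search connectivity test on an undirected graph."""
    nodes = list(nodes)
    adjacency = {v: set() for v in nodes}
    for s, t in edges:
        adjacency[s].add(t)
        adjacency[t].add(s)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        v = stack.pop()
        for w in adjacency[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(nodes)


def mclst_size(logical_nodes, routing, fibers):
    """Smallest number of surviving fibers that keeps the logical topology connected.

    routing maps each logical link (s, t) to the set of fibers on its lightpath.
    """
    for size in range(len(fibers) + 1):
        for surviving in combinations(fibers, size):
            alive = set(surviving)
            # A logical link survives only if every fiber on its route survives.
            up = [link for link, route in routing.items() if route <= alive]
            if is_connected(logical_nodes, up):
                return size
    return None


# Hypothetical instance: 3 logical nodes, 4 fibers named "a".."d".
routing = {
    (0, 1): frozenset({"a"}),
    (1, 2): frozenset({"b", "c"}),
    (0, 2): frozenset({"a", "d"}),
}
print(mclst_size([0, 1, 2], routing, ["a", "b", "c", "d"]))  # 2
```

Here the fibers "a" and "d" together keep logical links (0, 1) and (0, 2) alive, which is enough to connect all three logical nodes, while no single fiber suffices.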

4.8.2  Proof  of  Theorem  4.11 

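Theorem 4.11: Given two vectors $N = [N_i \mid i = 0, \ldots, m]$ and
$M = [M_i \mid i = 0, \ldots, m]$. For any $j$, let
$\Delta_j = \sum_{i=0}^{j} (M_i - N_i)$ and
$\delta_j = \max_{j < i \le m} \left\{ \frac{N_i - M_i}{\binom{m}{i}} \right\} \le 1$.
If the vector N is k-lexicographically smaller than M, then:

$$\sum_{i=0}^{m} N_i p^i (1-p)^{m-i} \le \sum_{i=0}^{m} M_i p^i (1-p)^{m-i}$$

for $p \le p_0^l = \min \left\{ 0.5, \max_{d \le j \le d+k-1} B_j \right\}$, where
$d = \min \{ d : N_d < M_d \}$ and:

$$B_j = \begin{cases} 0.5, & \text{if } j = m \\ \dfrac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j}, & \text{otherwise.} \end{cases}$$
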

Proof. We first prove the following lemma.


Lemma 4.16 If vector N is k-lexicographically smaller than vector M, then for all
$j \le d + k - 1$, where $d = \min \{ d : N_d < M_d \}$:

$$\sum_{i=0}^{j} (M_i - N_i) p^i (1-p)^{m-i} \ge \Delta_j p^j (1-p)^{m-j}, \quad \text{for } 0 \le p \le 0.5. \tag{4.5}$$

Proof. We prove, by induction on j, that (4.5) holds for all $j \le d + k - 1$. First,
if j = 0,

$$\sum_{i=0}^{j} (M_i - N_i) p^i (1-p)^{m-i} = (M_0 - N_0)(1-p)^m = \Delta_0 p^0 (1-p)^{m-0}.$$

Therefore, (4.5) holds for j = 0. Now suppose (4.5) holds for all $i \le j$ for some
$j < d + k - 1$. Then, we have:

$$\begin{aligned}
\sum_{i=0}^{j+1} (M_i - N_i) p^i (1-p)^{m-i}
&= \sum_{i=0}^{j} (M_i - N_i) p^i (1-p)^{m-i} + (M_{j+1} - N_{j+1}) p^{j+1} (1-p)^{m-(j+1)} \\
&\ge \Delta_j p^j (1-p)^{m-j} + (M_{j+1} - N_{j+1}) p^{j+1} (1-p)^{m-(j+1)}, \quad \text{by induction hypothesis} \\
&\ge \Delta_j p^{j+1} (1-p)^{m-(j+1)} + (M_{j+1} - N_{j+1}) p^{j+1} (1-p)^{m-(j+1)}, \quad \text{since } \tfrac{p}{1-p} \le 1 \text{ and } \Delta_j \ge 0 \\
&= \Delta_{j+1} p^{j+1} (1-p)^{m-(j+1)}.
\end{aligned}$$

Therefore, by induction, (4.5) is true for all $j \le d + k - 1$. □

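Lemma 4.17 Given a fixed k, if $\Delta_i \ge 0$ for all $i \le d + k - 1$, then for
any $d \le j \le d + k - 1$:

$$F_1(p) \le F_2(p),$$

for $0 \le p \le \min \{ 0.5, B_j \}$, where:

$$B_j = \begin{cases} 0.5, & \text{if } j = m \\ \dfrac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j}, & \text{otherwise.} \end{cases}$$
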

Proof. First, note that by definition of $\delta_j$, for any $i > j$:

$$M_i - N_i \ge -\delta_j \binom{m}{i}. \tag{4.6}$$

If $k = m - d + 1$, then Lemma 4.16 implies that, for p ≤ 0.5:

$$\sum_{i=0}^{m} (M_i - N_i) p^i (1-p)^{m-i} \ge \Delta_m p^m \ge 0.$$

Therefore, the lemma is true for $k = m - d + 1$. Now suppose $k < m - d + 1$. If
$\delta_j \le 0$ for some $j \le d + k - 1$, this implies for any
$d + k \le l \le m$:

$$\begin{aligned}
\Delta_l &= \Delta_{d+k-1} + \sum_{i=d+k}^{l} (M_i - N_i) \\
&\ge \Delta_{d+k-1} - \delta_j \sum_{i=d+k}^{l} \binom{m}{i}, \quad \text{by Equation (4.6)} \\
&\ge 0.
\end{aligned}$$

This last inequality is due to the fact that $\delta_j \le 0$, and that
$\Delta_{d+k-1} \ge 0$, since N is k-lexicographically smaller than M. Therefore, in
this case, the vector N is also $(m - d + 1)$-lexicographically smaller than M, and
the lemma is true as proved above. Therefore, in the rest of the proof, we assume
that $\delta_j > 0$.

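Since p ≤ 0.5 and $\Delta_i \ge 0$ for all $i \le d + k - 1$, by Lemma 4.16 we have,
for all $j \le d + k - 1$:

$$\sum_{i=0}^{j} (M_i - N_i) p^i (1-p)^{m-i} \ge \Delta_j p^j (1-p)^{m-j}. \tag{4.7}$$

Next, we will use the following result to bound the tail probability of the Binomial
distribution:


Lemma 4.18 For $r > mp$,

$$\sum_{i=r}^{m} \binom{m}{i} p^i (1-p)^{m-i} \le \binom{m}{r} p^r (1-p)^{m-r} \cdot \frac{r(1-p)}{r - mp}.$$

Proof. See [40]. □


Therefore, since $p \le \frac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j} < \frac{j+1}{m}$,
by Lemma 4.18, we have:

$$\begin{aligned}
\sum_{i=j+1}^{m} \binom{m}{i} p^i (1-p)^{m-i}
&\le \binom{m}{j+1} p^{j+1} (1-p)^{m-(j+1)} \cdot \frac{(j+1)(1-p)}{j + 1 - mp} \\
&= \binom{m}{j+1} p^{j} (1-p)^{m-j} \cdot \frac{(j+1)p}{j + 1 - mp}.
\end{aligned} \tag{4.8}$$
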
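In addition, since $p \le \frac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j}$,
we have:

$$\frac{(j+1)p}{j + 1 - mp} \le \frac{\Delta_j}{\delta_j \binom{m}{j+1}}. \tag{4.9}$$

It follows that:

$$\begin{aligned}
\sum_{i=0}^{m} (M_i - N_i) p^i (1-p)^{m-i}
&= \sum_{i=0}^{j} (M_i - N_i) p^i (1-p)^{m-i} + \sum_{i=j+1}^{m} (M_i - N_i) p^i (1-p)^{m-i} \\
&\ge \sum_{i=0}^{j} (M_i - N_i) p^i (1-p)^{m-i} - \delta_j \sum_{i=j+1}^{m} \binom{m}{i} p^i (1-p)^{m-i}, \quad \text{by Equation (4.6)} \\
&\ge \Delta_j p^j (1-p)^{m-j} - \delta_j \binom{m}{j+1} p^j (1-p)^{m-j} \cdot \frac{(j+1)p}{j + 1 - mp}, \quad \text{by Equations (4.7) and (4.8)} \\
&= \delta_j p^j (1-p)^{m-j} \left( \frac{\Delta_j}{\delta_j} - \binom{m}{j+1} \frac{(j+1)p}{j + 1 - mp} \right) \\
&\ge \delta_j p^j (1-p)^{m-j} \left( \frac{\Delta_j}{\delta_j} - \binom{m}{j+1} \frac{\Delta_j}{\delta_j \binom{m}{j+1}} \right), \quad \text{by Equation (4.9)} \\
&= 0.
\end{aligned}$$

□


As a result of Lemma 4.17, we can pick the $d \le j \le d + k - 1$ such that $B_j$ is
maximized to obtain the largest upper bound for p, and Theorem 4.11 follows. □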
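
The binomial tail bound of Lemma 4.18, on which the proof above relies, can be sanity-checked numerically. The Python sketch below compares the exact tail with the bound $\binom{m}{r} p^r (1-p)^{m-r} \cdot r(1-p)/(r - mp)$ over a range of parameters; the function names are hypothetical, and m = 29 is chosen only because it matches the number of physical links in the augmented NSFNET.

```python
from math import comb


def binomial_tail(m, p, r):
    """Exact value of sum_{i=r}^{m} C(m, i) p^i (1 - p)^(m - i)."""
    return sum(comb(m, i) * p**i * (1 - p)**(m - i) for i in range(r, m + 1))


def tail_upper_bound(m, p, r):
    """Upper bound of Lemma 4.18, valid only for r > m * p."""
    if r <= m * p:
        raise ValueError("the bound requires r > m * p")
    return comb(m, r) * p**r * (1 - p)**(m - r) * r * (1 - p) / (r - m * p)


m = 29
for p in (0.01, 0.1, 0.3, 0.5):
    # int(m * p) + 1 is the smallest integer r with r > m * p.
    for r in range(int(m * p) + 1, m + 1):
        # Small tolerance for floating-point rounding (the bound is tight at r = m).
        assert binomial_tail(m, p, r) <= tail_upper_bound(m, p, r) + 1e-12
```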
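4.8.3 Proof of Theorem 4.13


Theorem 4.13: Given two vectors $N = [N_i \mid i = 0, \ldots, m]$ and
$M = [M_i \mid i = 0, \ldots, m]$. For any $j$, let
$\Delta_j = \sum_{i=m-j}^{m} (M_i - N_i)$ and
$\delta_j = \max_{0 \le i < m-j} \left\{ \frac{N_i - M_i}{\binom{m}{i}} \right\} \le 1$.
If N is k-colexicographically smaller than M, then:

$$\sum_{i=0}^{m} N_i p^i (1-p)^{m-i} \le \sum_{i=0}^{m} M_i p^i (1-p)^{m-i}$$

for $p \ge p_0^h = \max \left\{ 0.5, \min_{c \le j \le c+k-1} C_j \right\}$, where
$c = \min \{ i : N_{m-i} < M_{m-i} \}$ and:

$$C_j = \begin{cases} 0.5, & \text{if } j = m \\ 1 - \dfrac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j}, & \text{otherwise.} \end{cases}$$
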

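Proof. Let $N'_i = N_{m-i}$ and $M'_i = M_{m-i}$, for $i = 0, \ldots, m$; and let
$\overrightarrow{N'}_k = \sum_{i=0}^{k} N'_i$ and
$\overrightarrow{M'}_k = \sum_{i=0}^{k} M'_i$. It follows that the vector

$$N' := [N'_i \mid i = 0, \ldots, m]$$

is k-lexicographically smaller than the vector

$$M' := [M'_i \mid i = 0, \ldots, m].$$

By Theorem 4.11,

$$\sum_{i=0}^{m} (M_i - N_i) p^i (1-p)^{m-i} = \sum_{j=0}^{m} (M'_j - N'_j) q^j (1-q)^{m-j} \ge 0, \quad \text{where } q = 1 - p,$$

for $q \le \min \left\{ 0.5, \max_{d \le j \le d+k-1} B_j \right\}$, where
$d = \min \{ i : N'_i < M'_i \} = c$ and:

$$B_j = \begin{cases} 0.5, & \text{if } j = m \\ \dfrac{1}{\frac{m}{j+1} + \delta_j \binom{m}{j+1} / \Delta_j}, & \text{otherwise.} \end{cases}$$

In the above expression, we have:

$$\Delta_j = \sum_{i=0}^{j} (M'_i - N'_i) \quad \text{and} \quad \delta_j = \max_{j+1 \le i \le m} \left\{ \frac{N'_i - M'_i}{\binom{m}{i}} \right\}.$$

Note that $B_j = 1 - C_j$ for $d \le j \le d + k - 1$. Therefore, lightpath routing 1
is at least as reliable as lightpath routing 2 for

$$\begin{aligned}
p = 1 - q &\ge 1 - \min \left\{ 0.5, \max_{d \le j \le d+k-1} B_j \right\} \\
&= \max \left\{ 0.5, \min_{d \le j \le d+k-1} (1 - B_j) \right\} \\
&= \max \left\{ 0.5, \min_{d \le j \le d+k-1} C_j \right\}.
\end{aligned}$$

□
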
Chapter 5


Algorithms  to  Improve  Reliability  in 
Layered  Networks 


In  the  previous  chapter,  we  have  shown  that  when  physical  link  failures  are  rare, 
the  lightpath  routing  that  minimizes  the  lexicographical  ordering  will  maximize  the 
cross-layer  reliability.  We  have  proposed  a  number  of  survivable  lightpath  routing 
heuristics in Chapter 2 where the objective is to maximize the MCLC. Since a
lightpath routing with a larger MCLC value is lexicographically smaller, these algorithms
can  be  considered  as  the  first  step  towards  maximizing  the  cross-layer  reliability  under 
the  low  failure  probability  regime.  In  this  chapter,  we  continue  in  this  direction  to 
develop  algorithms  that  not  only  maximize  the  MCLC,  but  also  minimize  the  number 
of  MCLCs. 

All  algorithms  developed  in  this  chapter  follow  a  common  iterative  pattern,  where 
“local”  changes  are  incrementally  applied  to  the  given  layered  network  to  improve  its 
cross-layer  reliability.  In  each  iteration,  some  preprocessing  is  performed  to  construct 
the  set  of  MCLCs  in  the  network,  and  a  local  change  to  the  network  is  applied  such 
that  at  least  some  of  these  MCLCs  will  be  eliminated  after  the  change.  The  process 
is  repeated  until  no  further  improvement  can  be  found,  in  which  case  the  lightpath 
routing  reaches  a  local  optimum  lexicographically. 

We will consider two different approaches under this framework. In Section 5.1, we
will first study the lightpath rerouting method in which an iteration involves changing
the physical route of an existing lightpath. By rerouting lightpaths in the network,
one can possibly improve the reliability of a layered network without changing the
physical and logical topologies. We will formulate the lightpath rerouting as an
optimization problem, where the objective is to find the best way to reroute a
lightpath so that the reliability improvement is maximized. In Section 5.1.2, we will
develop an ILP to find the optimal lightpath to reroute. In Section 5.1.3, we will
propose an approximation algorithm that can compute a near-optimal solution in a much
shorter time. Simulation results on these algorithms will be presented in
Section 5.1.4.

Conceivably,  one  can  further  improve  the  reliability  of  the  network  by  adding 
logical  links  to  the  network.  Therefore,  in  Section  5.2,  we  will  consider  logical  topology 
augmentation  to  improve  the  reliability  of  a  layered  network.  By  iteratively  adding 
logical  links  to  a  network,  one  can  eliminate  some  of  the  existing  MCLCs  of  the 
network,  thereby  reducing  the  number  of  MCLCs,  or  potentially  increasing  the  size  of 
the  MCLC.  We  will  formulate  the  augmentation  as  an  optimization  problem,  where 
the  objective  is  to  find  the  placement  of  the  new  logical  link  that  will  eliminate 
the  largest  number  of  MCLCs.  Similar  to  the  rerouting  problem,  an  ILP  and  an 
approximation  algorithm  will  be  presented.  In  addition,  in  Section  5.2.5,  we  develop 
a  lower  bound  on  the  minimum  number  of  additional  logical  links  required  to  increase 
the  MCLC  value  of  the  layered  network.  We  will  use  this  lower  bound  to  evaluate 
the  effectiveness  of  our  incremental  augmentation  algorithm. 

Finally, to conclude this chapter, in Section 5.3 we will carry out a case study
on a real-world IP-over-WDM network. We will apply different techniques developed
throughout this thesis, including survivable lightpath routing, lightpath rerouting
and logical topology augmentation, to study the reliability gain achieved by these
techniques in a real-world setting.

5.1  Lightpath  Rerouting 

Given an existing lightpath routing of a layered network, the lightpath rerouting
method involves changing the physical route of certain logical links in order to reduce
the number of small cross-layer cuts in the network. Figure 5-1 shows a simple
example of how rerouting can eliminate small cuts. In the figure, the solid lines
depict the physical topology and the dashed lines depict the logical topology.
Initially, the Min Cross Layer Cut size of the lightpath routing is 1 and there are
three cross-layer cuts of this size. The logical links are then rerouted sequentially
so that the network reliability is incrementally improved. At the end, the MCLC value
of the lightpath routing is increased to 2.


Figure 5-1: Improving reliability via lightpath rerouting. The physical topology is in solid lines, and
the lightpath routing of the logical topology is in dashed lines. The MCLC value and the number
of MCLCs in the lightpath routings are denoted by $d$ and $N_d$.

Generally  speaking,  the  rerouting  framework  can  be  described  as  follows.  Given 
any  initial  lightpath  routing, 

(1) Select a logical link, say (s, t), and reroute (s, t) to reduce the number of MCLCs.

(2)  Repeat  (1)  until  no  further  improvement  is  possible. 

Therefore, each iteration will reduce the number of MCLCs, and will increase
the size of the MCLC if every MCLC is converted into a non-cut. When the rerouting
terminates, the final lightpath routing is locally optimal, in the sense that no further
improvement is possible by rerouting a single lightpath.

Figure 5-2: The lightpath rerouting framework.
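
The two-step framework can be sketched in Python as follows. This is a brute-force illustration for very small instances only, not the ILP or the approximation algorithm developed later in this section: it tries every simple physical path for every logical link and recomputes the MCLC by exhaustive search. The function names, the dictionary layout of a lightpath routing, and the four-node ring example are assumptions introduced for this sketch.

```python
from itertools import combinations

def is_cut(nodes, routing, S):
    """True if failing the physical links in S disconnects the logical topology."""
    S = set(S)
    adj = {v: [] for v in nodes}
    for (s, t), route in routing.items():
        if not (set(route) & S):           # logical link survives the failure of S
            adj[s].append(t)
            adj[t].append(s)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) < len(nodes)

def mclc(nodes, phys_links, routing):
    """Return (d, N_d): the MCLC value and the number of cross-layer cuts of that size."""
    for d in range(1, len(phys_links) + 1):
        n = sum(is_cut(nodes, routing, S) for S in combinations(phys_links, d))
        if n:
            return d, n
    raise ValueError("no cross-layer cut found")

def simple_paths(phys_links, s, t):
    """Yield every simple s-t path as a list of physical links (sorted node pairs)."""
    adj = {}
    for a, b in phys_links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    def dfs(v, seen, path):
        if v == t:
            yield list(path)
            return
        for w in adj.get(v, []):
            if w not in seen:
                yield from dfs(w, seen | {w}, path + [tuple(sorted((v, w)))])
    yield from dfs(s, {s}, [])

def reroute_until_local_optimum(nodes, phys_links, routing):
    """Steps (1) and (2): reroute one logical link at a time while (d, N_d) improves."""
    routing = dict(routing)
    best = mclc(nodes, phys_links, routing)
    improved = True
    while improved:
        improved = False
        for link in list(routing):
            for path in simple_paths(phys_links, *link):
                trial = {**routing, link: path}
                d, n = mclc(nodes, phys_links, trial)
                # Better means a larger d, or the same d with fewer MCLCs.
                if (-d, n) < (-best[0], best[1]):
                    routing, best, improved = trial, (d, n), True
    return routing, best

# Hypothetical example: a logical triangle embedded on a four-node physical ring.
ring = [(1, 2), (2, 3), (3, 4), (1, 4)]
initial = {(1, 3): [(1, 2), (2, 3)], (1, 2): [(1, 2)], (2, 3): [(2, 3)]}
final, (d, n_d) = reroute_until_local_optimum([1, 2, 3], ring, initial)
```

In this example the initial routing has MCLC value 1, and moving logical link (1, 3) to the other side of the ring raises it to 2. The local optimum that is reached can depend on the order in which the logical links are examined.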

In Chapter 2, we presented several formulations for routing the logical links jointly
to maximize the MCLC. The lightpath rerouting framework provides an alternative
approach for designing survivable lightpath routings. Instead of solving the
formulations that jointly route the logical links, we can construct an initial lightpath
routing using a fast algorithm such as shortest path routing, and then iteratively apply
rerouting until the lightpath routing reaches a local optimum. Since each iteration
computes a physical route for only one logical link, this approach effectively breaks
down the joint lightpath routing problem into multiple smaller steps, which helps
improve the overall running time. As we will see in Section 5.1.4, this rerouting
approach is very effective in obtaining lightpath routings with better reliability than the
formulations in Chapter 2.

5.1.1  Effects  of  Rerouting  a  Lightpath 

Suppose that an initial lightpath routing is given, and let d be the size of the MCLC
under the initial routing. When the physical route of a logical link changes, some
of the cross-layer cuts will be converted into non-cuts, and some non-cuts will be
converted into cross-layer cuts. In the low failure probability regime, the reliability
will be improved by the rerouting if the following conditions hold:

1. The conversion of cross-layer cuts with size d to non-cuts outnumbers the
conversion in the opposite direction.

2.  The  MCLC  value  does  not  decrease. 

Therefore, we can formulate the lightpath rerouting as an optimization problem
to maximize the reduction in the number of MCLCs, subject to the constraint that no
non-cut of size smaller than d is converted into a cross-layer cut. We next characterize
the reduction in the number of MCLCs achieved by a lightpath rerouting, which will be
used as the basis of the ILP formulation.

Given the physical topology $G_P = (V_P, E_P)$ and the logical topology $G_L = (V_L, E_L)$,
we model a lightpath routing as a set of binary constants $\{f_{ij}^{st}\}$, where
$f_{ij}^{st} = 1$ if and only if logical link $(s,t)$ uses physical link $(i,j)$ in the lightpath
routing. For a given set of physical links $S$, we define the logical residual graph for $S$,
denoted as $G_L^S$, to be the subgraph of $G_L$ with edge set
$\{(s,t) \in E_L : \sum_{(i,j) \in S} f_{ij}^{st} = 0\}$. In other words, the residual graph
consists of logical links that use none of the physical links in $S$. By definition, the set
$S$ is a cross-layer cut if and only if its logical residual graph is disconnected. Given a
cross-layer cut $S$, it is called a $k$-way cross-layer cut if its logical residual graph has
$k$ connected components. In addition, given a cross-layer non-cut $T$ for a lightpath
routing, we call a logical link $(s,t)$ critical to $T$ if $(s,t)$ is a cut edge of the residual
graph $G_L^T$, that is, it is an edge in $G_L^T$ whose removal will disconnect the residual
graph.
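
As an illustration of these definitions, the following Python sketch computes the logical residual graph, the number of connected components (so a set is a k-way cut when the count is k > 1), and the logical links critical to a non-cut. The function names and the dictionary layout of a lightpath routing (logical link mapped to the list of physical links on its route) are assumptions made for this example.

```python
def residual_links(routing, S):
    """Logical links whose physical route uses none of the physical links in S."""
    S = set(S)
    return [link for link, route in routing.items() if not (set(route) & S)]

def components(nodes, links):
    """Connected components of the logical graph (nodes, links), via union-find."""
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    for s, t in links:
        parent[find(s)] = find(t)
    groups = {}
    for v in nodes:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())

def num_ways(nodes, routing, S):
    """Number of components of the residual graph for S; S is a cut iff this exceeds 1."""
    return len(components(nodes, residual_links(routing, S)))

def critical_links(nodes, routing, T):
    """Cut edges of the residual graph for a non-cut T."""
    res = residual_links(routing, T)
    if len(components(nodes, res)) != 1:
        raise ValueError("T is a cross-layer cut, not a non-cut")
    return [e for e in res
            if len(components(nodes, [x for x in res if x != e])) > 1]
```

For a logical triangle on nodes 1, 2, 3 whose link (1, 3) is routed over physical links (1, 2) and (2, 3), `num_ways` reports that the single physical link (1, 2) is a 2-way cut, because both logical links touching node 1 traverse it.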

The  following  theorems  describe  the  conditions  for  a  lightpath  rerouting  that 
results  in  conversions  between  cross-layer  cuts  and  non-cuts. 

Theorem 5.1 Let $S$ be a cross-layer cut for a lightpath routing. Rerouting logical link
$(s,t)$ from physical path $P_1$ to $P_2$ turns $S$ into a non-cut if and only if the following
conditions are true:

1. $S$ is a 2-way cross-layer cut.

2. $s$ and $t$ are disconnected in the residual graph for $S$.

3. $P_2$ does not use any physical links in $S$.

Proof. Let $G_L^{S}$ and $G_L^{S\prime}$ be the residual logical graphs for $S$ under the original and
new lightpath routings respectively. First, suppose all the above conditions are true.
Since $S$ is a 2-way cross-layer cut under the original lightpath routing, the logical
residual graph $G_L^{S}$ consists of 2 connected components, each of which contains one
of $s$ and $t$. All logical links that are in $G_L^{S}$ will remain in $G_L^{S\prime}$, because none of their
physical routes have changed. In addition, since the new route $P_2$ does not use any
physical links in $S$, the logical link $(s,t)$ will be present in $G_L^{S\prime}$, making $G_L^{S\prime}$ connected.
This implies $S$ becomes a non-cut under the new lightpath routing.

Conversely, if $S$ is a $k$-way cross-layer cut with $k > 2$, or $s, t$ belong to the same
connected component in $G_L^{S}$, rerouting $(s,t)$ will not connect the logical residual
graph, so $S$ remains a cross-layer cut. In addition, if $P_2$ uses some physical link in
$S$, $(s,t)$ will not be present in the new residual graph $G_L^{S\prime}$, so $G_L^{S\prime} = G_L^{S}$, which also
implies $S$ remains a cross-layer cut. □

Theorem 5.2 Let $T$ be a cross-layer non-cut for a lightpath routing. Rerouting logical
link $(s,t)$ from physical path $P_1$ to $P_2$ turns $T$ into a cross-layer cut if and only
if the following conditions are true:

1. $(s,t)$ is critical to $T$.

2. $P_2$ uses some physical link in $T$.

Proof. Let $G_L^{T}$ and $G_L^{T\prime}$ be the residual logical graphs for $T$ under the original and
new lightpath routings respectively. First, suppose both conditions are true. Since $P_2$
uses some physical link in $T$, the logical link $(s,t)$ will be removed from $G_L^{T}$ under the new
lightpath routing. Since $(s,t)$ is critical to the non-cut $T$, its removal will disconnect
the residual graph, which means that $T$ will become a cross-layer cut.

Conversely, suppose either of the conditions is false. In this case, the logical residual
graph $G_L^{T\prime}$ will remain connected after rerouting logical link $(s,t)$. So $T$ remains a
non-cut. □
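
The conditions of Theorems 5.1 and 5.2 can be checked directly from the original lightpath routing, without rebuilding the residual graph for the new routing. The sketch below is one way to do so in Python; the function names and the routing layout (logical link mapped to its list of physical links) are assumptions of this example, and `noncut_becomes_cut` presumes that its argument `T` really is a non-cut.

```python
def residual(routing, S):
    """Logical links that use none of the physical links in S."""
    return [e for e, route in routing.items() if not (set(route) & set(S))]

def comps(nodes, links):
    """Connected components of the logical graph (nodes, links)."""
    comp = {v: {v} for v in nodes}
    for s, t in links:
        if comp[s] is not comp[t]:
            merged = comp[s] | comp[t]
            for v in merged:
                comp[v] = merged
    return {frozenset(c) for c in comp.values()}

def cut_becomes_noncut(nodes, routing, S, link, new_route):
    """Theorem 5.1: does rerouting `link` onto `new_route` turn the cut S into a non-cut?"""
    cs = comps(nodes, residual(routing, S))
    s, t = link
    two_way = len(cs) == 2                                   # condition 1
    separated = not any(s in c and t in c for c in cs)       # condition 2
    avoids_S = not (set(new_route) & set(S))                 # condition 3
    return two_way and separated and avoids_S

def noncut_becomes_cut(nodes, routing, T, link, new_route):
    """Theorem 5.2: does rerouting `link` onto `new_route` turn the non-cut T into a cut?"""
    res = residual(routing, T)
    critical = link in res and len(comps(nodes, [e for e in res if e != link])) > 1
    hits_T = bool(set(new_route) & set(T))
    return critical and hits_T
```

Both predicates only inspect the residual graph of the original routing plus the candidate route, which is what makes them usable as building blocks for the optimization that follows.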
Therefore, the optimal rerouting should maximize the number of cross-layer cuts
satisfying Theorem 5.1 and minimize the number of non-cuts satisfying Theorem 5.2.
However, it is also important to ensure that none of the non-cuts with size smaller
than $d$ is converted to cross-layer cuts by the rerouting, since otherwise the MCLC
value will decrease. The following theorem states that only non-cuts with size at least
$d - 1$ can be converted into a cross-layer cut by rerouting a single lightpath.

Theorem 5.3 Let $d$ be the Min Cross Layer Cut value of a lightpath routing and let
$\mathcal{NC}$ be the set of cross-layer non-cuts that can be converted into cross-layer cuts by
rerouting a single logical link. Then $|T| \ge d - 1$ for all $T \in \mathcal{NC}$.

Proof. Suppose $\mathcal{NC}$ contains a convertible non-cut $T$ with size less than $d - 1$. Since
$T$ is convertible by rerouting a single logical link, by Theorem 5.2, there exists a
logical link $(s,t)$ that is critical to $T$. Now let $l$ be any physical link used by $(s,t)$;
then the set of physical links $T \cup \{l\}$ would disconnect the logical residual graph and
is therefore a cross-layer cut. However, such a set contains at most $d - 1$ physical
links, contradicting that $d$ is the Min Cross Layer Cut. □
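
The size bound used in the last step of the proof can be written out explicitly. Since $|T| < d - 1$ and sizes are integers, $|T| \le d - 2$, and adding the single physical link $l$ gives:

```latex
\[
  |T \cup \{l\}| \;\le\; |T| + 1 \;\le\; (d - 2) + 1 \;=\; d - 1 \;<\; d .
\]
```

A cross-layer cut with fewer than $d$ physical links cannot exist when $d$ is the Min Cross Layer Cut value, which is the contradiction.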

Therefore, when rerouting a lightpath, we need to make sure that none of the non-cuts
with size $d - 1$ get converted into cuts in order to prevent the MCLC value from
decreasing. Based on these observations, we next develop an ILP for the lightpath
rerouting problem.

5.1.2  ILP  for  Lightpath  Rerouting 

Let $(V_P, E_P)$ and $(V_L, E_L)$ be the physical and logical topologies. For the given
lightpath routing, let $d$ be the MCLC value, and let $\mathcal{C}_d$, $\mathcal{NC}_d$ and $\mathcal{NC}_{d-1}$ be the sets
of 2-way cross-layer cuts with size $d$, non-cuts with size $d$, and non-cuts with size
$d - 1$ respectively. The lightpath rerouting problem can be formulated as an ILP that
finds the logical link, and its new physical route, that maximizes the net reduction in
MCLCs.

The ILP can be considered as a path selection problem on an auxiliary graph
$G'_P = (V'_P, E'_P)$, where $V'_P = V_P \cup \{u, v\}$, with $u$ and $v$ being the additional source and
sink nodes in the auxiliary graph; and $E'_P = E_P \cup \{(u,x), (x,v) : x \in V_P\}$. Figure 5-3
illustrates the construction of the auxiliary graph.


Figure 5-3: Construction of the auxiliary graph for the ILP. u and v are the additional source and
sink nodes, and the dashed lines are the additional links in the auxiliary graph.
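
The construction of the auxiliary graph is small enough to state in a few lines of Python. This is only a sketch of the set definitions above; the function name and the default labels "u" and "v" for the added source and sink are assumptions of this example.

```python
def auxiliary_graph(V_P, E_P, u="u", v="v"):
    """Build (V'_P, E'_P): add a source u linked to every physical node
    and a sink v linked from every physical node."""
    V_aux = set(V_P) | {u, v}
    E_aux = list(E_P) + [(u, x) for x in V_P] + [(x, v) for x in V_P]
    return V_aux, E_aux
```

A path from u to v in this graph enters the physical topology at some node s and leaves it at some node t, so choosing the path also chooses the endpoints of the logical link being rerouted.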


We first define the following variables and parameters:

1. Variables:

• $\{g_{st} : (s,t) \in E_L\}$: 1 if logical link $(s,t)$ is rerouted, and 0 otherwise.

• $\{f_{ij} : (i,j) \in E'_P\}$: Flow variables describing a path in $G'_P$ from node $u$ to
node $v$.

• $\{y_c : c \in \mathcal{C}_d\}$: 1 if the cross-layer cut $c$ is converted into a non-cut by the
lightpath rerouting, and 0 otherwise.

• $\{z_c : c \in \mathcal{NC}_d\}$: 1 if the non-cut $c$ is converted into a cross-layer cut by the
lightpath rerouting, and 0 otherwise.

2. Parameters:

• $\{h_c^{st} : c \in \mathcal{C}_d, (s,t) \in E_L\}$: 1 if logical nodes $s$ and $t$ are disconnected by
the 2-way cut $c$, and 0 otherwise.

• $\{q_c^{st} : c \in \mathcal{NC}_d \cup \mathcal{NC}_{d-1}, (s,t) \in E_L\}$: 1 if logical link $(s,t)$ is critical to the
non-cut $c$, and 0 otherwise.

• $\{l_{ij}^{c} : c \in \mathcal{C}_d \cup \mathcal{NC}_d \cup \mathcal{NC}_{d-1}, (i,j) \in E_P\}$: 1 if physical link $(i,j)$ is in set
$c$, and 0 otherwise.
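The lightpath rerouting can be formulated as follows:

REROUTE:
\begin{align}
\text{Maximize} \quad & \sum_{c \in \mathcal{C}_d} y_c - \sum_{c \in \mathcal{NC}_d} z_c, \quad \text{subject to:} \notag \\
& g_{st} \le (f_{us} + f_{tv})/2, && \forall (s,t) \in E_L \tag{5.1} \\
& \sum_{(s,t) \in E_L} g_{st} = 1 \tag{5.2} \\
& l_{ij}^{c} f_{ij} + \sum_{(s,t) \in E_L} q_c^{st} g_{st} \le 1, && \forall c \in \mathcal{NC}_{d-1},\ (i,j) \in E_P \tag{5.3} \\
& l_{ij}^{c} f_{ij} + \sum_{(s,t) \in E_L} q_c^{st} g_{st} \le z_c + 1, && \forall c \in \mathcal{NC}_d,\ (i,j) \in E_P \tag{5.4} \\
& y_c \le \sum_{(s,t) \in E_L} h_c^{st} g_{st}, && \forall c \in \mathcal{C}_d \tag{5.5} \\
& y_c \le 1 - l_{ij}^{c} f_{ij}, && \forall (i,j) \in E_P,\ c \in \mathcal{C}_d \tag{5.6} \\
& \{(i,j) : f_{ij} = 1\} \text{ forms a } (u,v)\text{-path in } G'_P \tag{5.7} \\
& f_{ij},\ g_{st} \in \{0, 1\}; \quad 0 \le y_c,\ z_c \le 1 \notag
\end{align}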


The formulation can be interpreted as a path selection problem on the auxiliary
graph $G'_P$. Constraint (5.7), which requires that the variables $f_{ij}$ describe a path
from $u$ to $v$, can be expressed by the standard flow conservation constraints. As a
result, in a feasible solution to the formulation, the variables $f_{ij}$ represent a path
$u \to s \leadsto t \to v$, which corresponds to the new physical route for the logical link $(s,t)$
after the rerouting.
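
For reference, one common way to write the flow conservation constraints behind (5.7) is shown below. It assumes that every link of $G'_P$ is modeled as a pair of directed arcs, which is an assumption of this sketch rather than something stated in the formulation.

```latex
\[
  \sum_{j : (i,j) \in E'_P} f_{ij} \;-\; \sum_{j : (j,i) \in E'_P} f_{ji}
  \;=\;
  \begin{cases}
    1,  & i = u, \\
    -1, & i = v, \\
    0,  & \text{otherwise},
  \end{cases}
  \qquad \forall i \in V'_P .
\]
```

These equalities send one unit of flow from $u$ to $v$. On their own they do not exclude additional cycles that are detached from the path.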

Constraint (5.1) ensures that $g_{st}$ can be set to 1 only if $f_{ij}$ represents the path
$u \to s \leadsto t \to v$, and Constraint (5.2) makes sure that the chosen $(s,t)$ is indeed a
logical link in $E_L$. Therefore, exactly one logical link $(s,t)$ can have $g_{st} = 1$, and a
feasible solution to this ILP corresponds to a rerouting of the logical link.

In Constraint (5.3), the two terms correspond to the conditions in Theorem 5.2.
The constraint makes sure that at most one of the conditions is satisfied, thereby
preventing the non-cuts of size $d - 1$ from being converted into cross-layer cuts.
Similarly, Constraint (5.4) makes sure $z_c = 1$ for any non-cut $c \in \mathcal{NC}_d$ that is converted
into a cross-layer cut by the rerouting.
Finally, Constraints (5.5) and (5.6) describe conditions 2) and 3) of Theorem 5.1
respectively. Condition 1) holds by the definition of $\mathcal{C}_d$, which contains only 2-way cuts.
Therefore, $y_c$ can be 1 only if all conditions in the theorem are satisfied,
which implies that cross-layer cut $c$ is converted into a non-cut.

Since the objective is to maximize $y_c$ and minimize $z_c$, in an optimal solution
$y_c = 1$ if and only if cross-layer cut $c$ is converted into a non-cut, and $z_c = 1$ if and
only if non-cut $c$ is converted into a cross-layer cut. As a result, the objective function
reflects the net reduction in the number of MCLCs.

Note that the variables $y_c$ and $z_c$ will take on binary values in an optimal solution
even if they are not constrained to be integral. This observation significantly
reduces the number of binary variables in the formulation. There are $O(|E_P| + |E_L|)$
binary variables in the rerouting formulation, which is significantly fewer than the
$O(|E_P||E_L|)$ binary variables in the Multi-Commodity Flow lightpath routing
formulations in Chapter 2. As we will see in the simulation section, this translates to faster
running time.

For larger networks, however, solving the rerouting ILP may still be infeasible
in practice. One way to reduce the time to solve the ILP is to relax the binary
variables $f_{ij}$ in the formulation and use the randomized rounding discussed in Section 2.4.3
to construct a $(u,v)$-path from the optimal solution of the relaxed formulation. In
the following section, we describe a polynomial time $d$-approximation algorithm for
the rerouting problem. This provides an alternative way to apply rerouting in instances
that are too large to solve the ILP optimally. We will evaluate the performance of all
these approaches in Section 5.1.4.

5.1.3  An  Approximation  Algorithm  for  Lightpath  Rerouting 

We focus on the following question: Given the lightpath routing, and a logical link
$(s,t)$, what is the best way to reroute $(s,t)$ assuming the routes for all other logical
links are fixed? A solution to this problem will allow us to solve the lightpath rerouting
problem, since we can run the algorithm once for each logical link, and return the
best solution.

Similar to the previous section, let $\mathcal{C}_d$, $\mathcal{NC}_d$ and $\mathcal{NC}_{d-1}$ be the sets of cross-layer
cuts of size $d$, non-cuts of size $d$ and non-cuts of size $d - 1$ respectively. Now suppose
$Q$ is a new physical route for logical link $(s,t)$. According to Theorem 5.2, a non-cut
$T \in \mathcal{NC}_d \cup \mathcal{NC}_{d-1}$ will be converted into a cross-layer cut if and only if the following
are true:

1. $(s,t)$ is critical to $T$.

2. $Q$ uses some physical link in $T$.

Let $\mathcal{NC}_d^{st}$ and $\mathcal{NC}_{d-1}^{st}$ be the subsets of $\mathcal{NC}_d$ and $\mathcal{NC}_{d-1}$ that satisfy condition
(1). These two sets represent the non-cuts that can potentially be converted into
a cut by rerouting $(s,t)$. It immediately follows that any $(s,t)$-path that uses a
physical link in $\bigcup_{T \in \mathcal{NC}_{d-1}^{st}} T$ will create a cross-layer cut with size $d - 1$, which should
be forbidden for the new physical route. In addition, for any physical link $(i,j)$, the
set $\mathcal{L}_{ij}^{NC} = \{T \in \mathcal{NC}_d^{st} : (i,j) \in T\}$ represents the non-cuts with size $d$ that will be
converted into cross-layer cuts if the new route $Q$ for logical link $(s,t)$ contains the
physical link $(i,j)$.

Similarly, for a cross-layer cut $S \in \mathcal{C}_d$, it will remain a cross-layer cut after the
reroute if and only if any of the following is true, according to Theorem 5.1:

1. $S$ is a $k$-way cut with $k > 2$.

2. $s, t$ belong to the same connected component in the logical residual graph $G_L^S$.

3. $Q$ uses some physical link that is contained in $S$.

Let $\mathcal{C}_d^{st} \subseteq \mathcal{C}_d$ be the set of cross-layer cuts that satisfy condition (1) or (2).
This represents the set of cuts that will continue to be cross-layer cuts regardless of the new
physical route $Q$ for $(s,t)$. In addition, for each $(i,j) \in E_P$, the cross-layer cuts in
the set $\mathcal{L}_{ij}^{C} = \{S \in \mathcal{C}_d - \mathcal{C}_d^{st} : (i,j) \in S\}$ will also continue to be cross-layer cuts if the
new route $Q$ contains the physical link $(i,j)$.

Now, for each physical link $(i,j)$, let $\mathcal{L}_{ij} = \mathcal{L}_{ij}^{C} \cup \mathcal{L}_{ij}^{NC}$. If a physical link $(i,j)$
is used by the logical link $(s,t)$ in the new route $Q$, it will cause the sets in $\mathcal{L}_{ij}$
to be cross-layer cuts. Since every set of physical links in $\mathcal{C}_d^{st}$ will be a
cross-layer cut regardless of the physical route taken by $(s,t)$, the lightpath rerouting
problem for logical link $(s,t)$ can be formulated as choosing the $(s,t)$-path $Q$ in
$G'_P = (V_P, E_P - \bigcup_{T \in \mathcal{NC}_{d-1}^{st}} T)$ that minimizes $|\bigcup_{(i,j) \in Q} \mathcal{L}_{ij}|$. Although this is an instance of the
NP-Hard Minimum Color Path [124] problem, a simple $d$-approximation algorithm
exists, as described below:


Algorithm 4 REROUTE_SP(s,t)

1: Construct a weighted graph on $G'_P = (V_P, E_P - \bigcup_{T \in \mathcal{NC}_{d-1}^{st}} T)$, where each edge
$(i,j)$ is assigned weight $w(i,j) = |\mathcal{L}_{ij}|$.

2: Run Dijkstra's algorithm to find the shortest $(s,t)$-path in the weighted graph.
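
A minimal Python sketch of the two steps of REROUTE_SP is given below. It assumes that the families $\mathcal{NC}_{d-1}^{st}$, $\mathcal{NC}_d^{st}$ and $\mathcal{C}_d - \mathcal{C}_d^{st}$ have already been computed and are passed in as collections of sets of physical links; the function and argument names are hypothetical and chosen only for this illustration.

```python
import heapq

def reroute_sp(phys_links, s, t, nc_dm1_st, nc_d_st, cuts_d_free):
    """Return (path, weight) for the shortest s-t path, or None if t is unreachable.

    nc_dm1_st   -- non-cuts of size d-1 to which (s, t) is critical (links forbidden)
    nc_d_st     -- non-cuts of size d to which (s, t) is critical
    cuts_d_free -- cuts of size d that rerouting (s, t) could still eliminate
    """
    # Step 1: drop forbidden links and weight every remaining link by |L_ij|.
    forbidden = set().union(*nc_dm1_st)
    families = [frozenset(x) for x in list(nc_d_st) + list(cuts_d_free)]
    weight = {e: sum(e in fam for fam in families)
              for e in phys_links if e not in forbidden}
    adj = {}
    for (a, b), w in weight.items():
        adj.setdefault(a, []).append((b, w, (a, b)))
        adj.setdefault(b, []).append((a, w, (a, b)))

    # Step 2: Dijkstra's algorithm from s.
    dist, prev, heap = {s: 0}, {}, [(0, s)]
    while heap:
        d_cur, v = heapq.heappop(heap)
        if d_cur > dist.get(v, float("inf")):
            continue                      # stale heap entry
        if v == t:
            break
        for nxt, w, e in adj.get(v, []):
            nd = d_cur + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = (v, e)
                heapq.heappush(heap, (nd, nxt))
    if t not in dist:
        return None

    # Walk the predecessor pointers back from t to recover the path.
    path, v = [], t
    while v != s:
        v, e = prev[v]
        path.append(e)
    return path[::-1], dist[t]
```

On a four-node ring in which the single links (1, 2) and (2, 3) are the two cuts of size 1 that rerouting logical link (1, 3) could eliminate, the call `reroute_sp(ring, 1, 3, [], [], [{(1, 2)}, {(2, 3)}])` picks the zero-weight route through node 4.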


We prove that REROUTE_SP is a $d$-approximation algorithm.

Theorem 5.4 Let $Q^*$ be the optimal physical route for $(s,t)$ that results in the
minimum number of MCLCs, and let $Q^{SP}$ be the new route for $(s,t)$ returned by
REROUTE_SP. For any $(s,t)$-path $Q$, let $N_d(Q)$ be the number of cross-layer cuts
with size $d$ after rerouting $(s,t)$ with $Q$, where $d$ is the size of the MCLC. Then
$N_d(Q^{SP}) \le d \cdot N_d(Q^*)$.

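Proof. Given any $(s,t)$-path $Q$, define $\mathcal{L}(Q) = \bigcup_{(i,j) \in Q} \mathcal{L}_{ij}$. It follows that
$N_d(Q) = |\mathcal{L}(Q)| + |\mathcal{C}_d^{st}| = |\mathcal{L}(Q)| + K$, where $K = |\mathcal{C}_d^{st}|$ is a constant. In addition, let
$w(Q)$ be the total weight of the path $Q$ in the weighted graph constructed
by REROUTE_SP(s,t).

Since each set of physical links $S \in \mathcal{L}(Q)$ has size $d$, we have $|\{(i,j) : S \in \mathcal{L}_{ij}\}| \le d$,
which implies:

\begin{align}
w(Q) &= \sum_{(i,j) \in Q} |\mathcal{L}_{ij}| \notag \\
     &= \sum_{S \in \mathcal{L}(Q)} |\{(i,j) \in Q : S \in \mathcal{L}_{ij}\}| \notag \\
     &\le d \cdot |\mathcal{L}(Q)| \tag{5.8} \\
     &= d \cdot (N_d(Q) - K) \tag{5.9}
\end{align}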
Now, since $Q^{SP}$ is the minimum weight $(s,t)$-path in the graph, it follows that:

\begin{align*}
N_d(Q^{SP}) &= |\mathcal{L}(Q^{SP})| + K \\
            &\le w(Q^{SP}) + K \\
            &\le w(Q^*) + K \\
            &\le d \cdot (N_d(Q^*) - K) + K, \quad \text{by Equation (5.9)} \\
            &\le d \cdot N_d(Q^*).
\end{align*}

□
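
Two steps of this chain are used without comment and can be spelled out as follows. The first inequality holds because the size of a union is at most the sum of the sizes of its members, and the last one because $d \ge 1$ and $K \ge 0$:

```latex
\begin{align*}
  |\mathcal{L}(Q^{SP})|
    &= \Bigl|\bigcup_{(i,j) \in Q^{SP}} \mathcal{L}_{ij}\Bigr|
     \;\le\; \sum_{(i,j) \in Q^{SP}} |\mathcal{L}_{ij}|
     \;=\; w(Q^{SP}), \\
  d \cdot (N_d(Q^*) - K) + K
    &= d \cdot N_d(Q^*) - (d - 1) K
     \;\le\; d \cdot N_d(Q^*).
\end{align*}
```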

Therefore, the number of cross-layer cuts of size $d$ given by REROUTE_SP is
at most $d$ times that of the optimal reroute. Note that if the optimal new route for $(s,t)$
eliminates every MCLC of size $d$, then $N_d(Q^*) = 0$ and the approximation algorithm will
find a new route that achieves that as well. We state this observation as the following corollary.

Corollary 5.5 REROUTE_SP(s,t) will return a new route for $(s,t)$ that increases
the size of the MCLC of the layered network, if such a new route exists.

We can extend algorithm REROUTE_SP, which is based on Dijkstra's shortest
path algorithm, by using the $k$-shortest path algorithm [123] to successively compute
the next shortest path in $G'_P$, and keep track of the path $Q$ with the minimum value
of $|\mathcal{L}(Q)|$. The value $k$ reflects a tradeoff between running time and quality of the
solution. As we will see in Section 5.1.4, by picking a good value of $k$, we can obtain a
lightpath routing within a much shorter time than solving the ILP without sacrificing
much in solution quality.

Finally,  the  following  theorem  provides  a  sufficient  condition  for  encountering  the 
optimal  route  for  (s,  t)  during  the  course  of  the  successive  shortest  path  algorithm. 
Specifically,  if  the  successive  shortest  path  algorithm  returns  a  path  with  a  sufficiently 
large  weight,  the  algorithm  can  terminate  right  away. 

Theorem 5.6 Let $Q_i$ be the $i$th shortest path in the weighted graph $G'_P$, breaking ties
arbitrarily. Then, for any $i \ge 1$, if $w(Q_{i+1}) \ge d \cdot \min_{1 \le j \le i} |\mathcal{L}(Q_j)|$, then the path $Q_{j^*}$,
where $j^* = \arg\min_{1 \le j \le i} |\mathcal{L}(Q_j)|$, is an optimal route for $(s,t)$.

Proof. Let $R_i = \min_{1 \le j \le i} |\mathcal{L}(Q_j)|$ be the minimum value of $|\mathcal{L}(Q)|$ among the $(s,t)$-paths
$Q_1, \ldots, Q_i$. Suppose for some $i$, we have $w(Q_{i+1}) \ge d R_i$. This implies all $(s,t)$-paths $Q$
not in $\{Q_1, \ldots, Q_i\}$ have weight $w(Q) \ge d R_i$. By Equation (5.8), $|\mathcal{L}(Q)| \ge w(Q)/d \ge R_i$
for all such $Q$. This implies $Q_{j^*}$ is an optimal route. □
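
The successive shortest path extension and the stopping rule of Theorem 5.6 can be sketched as follows. To keep the example short, it ranks all simple paths by brute-force enumeration instead of running a true k-shortest path algorithm such as the one cited as [123], so it is only suitable for tiny graphs. The function names, the dictionary `L` (physical link mapped to the identifiers of the sets in $\mathcal{L}_{ij}$) and the example data are assumptions of this sketch.

```python
def simple_paths(phys_links, s, t):
    """Yield every simple s-t path as a list of physical links (sorted node pairs)."""
    adj = {}
    for a, b in phys_links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    def dfs(v, seen, path):
        if v == t:
            yield list(path)
            return
        for w in adj.get(v, []):
            if w not in seen:
                yield from dfs(w, seen | {w}, path + [tuple(sorted((v, w)))])
    yield from dfs(s, {s}, [])

def k_best_reroute(phys_links, s, t, L, d, k):
    """Examine up to k paths in order of weight; return (path, |L(path)|, proven).

    `proven` is True when the Theorem 5.6 test fired, so the returned path is optimal.
    """
    def weight(path):                     # w(Q) = sum of |L_ij| over the path
        return sum(len(L[e]) for e in path)
    def cost(path):                       # |L(Q)| = size of the union of the L_ij
        return len(set().union(*(L[e] for e in path)))

    ranked = sorted(simple_paths(phys_links, s, t), key=weight)
    best, best_cost, proven = None, None, False
    for path in ranked[:k]:
        # Theorem 5.6: once w(Q_{i+1}) >= d * min_j |L(Q_j)|, no later path can win.
        if best is not None and weight(path) >= d * best_cost:
            proven = True
            break
        c = cost(path)
        if best is None or c < best_cost:
            best, best_cost = path, c
    return best, best_cost, proven

# Hypothetical instance with d = 2: the lightest path is not the best one.
links = [(1, 2), (2, 3), (3, 6), (1, 5), (5, 6)]
L = {(1, 2): {"x"}, (2, 3): {"x", "y"}, (3, 6): {"y"},
     (1, 5): {"p", "q", "r"}, (5, 6): set()}
```

With k = 1 the sketch behaves like REROUTE_SP and returns the two-hop path through node 5, which touches three sets. With k = 2 it also examines the three-hop path, whose links share sets, and finds that it touches only two.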

5.1.4  Simulation  Results 

In  this  section,  we  present  our  simulation  results  on  the  lightpath  rerouting  approach. 
We  use  the  augmented  NSFNET  (Figure  2-3)  as  the  physical  topology,  and  the  same 
set  of  random  logical  topologies  in  Section  2.5  as  input,  and  run  the  lightpath  rerouting 
algorithms  on  these  instances.  We  will  compare  the  reliability  of  the  lightpath  routings 
produced  by  these  algorithms  with  the  best  known  ILP  lightpath  routing  formulation 
based  on  Multi-Commodity  flow,  presented  in  Section  2.4.2. 

Performance  of  ILP-Based  Rerouting 

We first investigate the effectiveness of the ILP-based lightpath rerouting approach
introduced in Section 5.1.2 to improve cross-layer reliability. In particular, we use
the best known lightpath routing algorithm based on multi-commodity flows, MCF_LF,
introduced in Section 2.4.2, to generate an initial set of lightpath routings. For each
lightpath routing, we repeatedly solve the ILP to improve its reliability, until a local
optimum is reached. We evaluate the gain in reliability achieved by this rerouting
approach.

The effectiveness of the rerouting approach to improve reliability is compared with
an alternative approach based on Simulated Annealing, which is a general random
search technique for optimization problems. In the Simulated Annealing approach,
the set of possible lightpath routings is modeled by a set of states, and the transition
between two neighboring states represents a rerouting of a logical link. Each state
is associated with a cost that reflects the reliability of the lightpath routing. In
particular, a lightpath routing with higher reliability is associated with a lower cost
in its corresponding state. Therefore, the state with the lowest cost corresponds to
the globally optimal lightpath routing. The algorithm randomly walks over the state
space, with preference towards states with lower cost, to search for the state with the
lowest cost. Compared with the rerouting approach, which stops at a local optimum,
the Simulated Annealing approach avoids getting trapped in a local optimum by
allowing a non-zero probability of transitioning to neighboring states with higher costs,
and thus can find the global optimum if the number of iterations is sufficiently large.
Readers can refer to [65] for details about Simulated Annealing.

In this Simulated Annealing experiment, we use the constant temperature function
$T(t) := 1$, and set the cost of each lightpath routing to be $N_d + 10000^{5-d}$, where $d$
is the Min Cross Layer Cut value for the lightpath routing and $N_d$ is the number
of cross-layer cuts with size $d$. Therefore, the cost of a lightpath routing is smaller
if it is lexicographically smaller, that is, if it has a larger $d$, or the same $d$ and a
smaller $N_d$. The Simulated Annealing algorithm starts with
the same set of initial lightpath routings generated by MCF_LF, and iterates until no
better solution is found for 50000 iterations. The best lightpath routing encountered
is returned as the output.
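
The cost function and a single acceptance decision can be sketched in Python as follows. Two things here are assumptions rather than statements from the text: the cost is read as $N_d + 10000^{5-d}$ (the alternative reading $N_d + 10000 \cdot (5 - d)$ induces the same ordering whenever $N_d < 10000$), and the acceptance rule is the standard Metropolis criterion, which the experiment description does not spell out. The function names are hypothetical.

```python
import math
import random

def sa_cost(d, n_d, base=10000, d_max=5):
    """Cost of a lightpath routing with MCLC value d and n_d cuts of that size.
    A larger d dominates; for equal d, fewer MCLCs gives a lower cost."""
    return n_d + base ** (d_max - d)

def accept(cur_cost, new_cost, temperature=1.0, rng=random):
    """Metropolis rule (assumed): always accept an improvement, otherwise accept
    with probability exp(-(new_cost - cur_cost) / temperature)."""
    if new_cost <= cur_cost:
        return True
    return rng.random() < math.exp(-(new_cost - cur_cost) / temperature)
```

Under this reading, a move that lowers the MCLC value is essentially never accepted at temperature 1 because the cost gap is huge, while a move that adds one or two MCLCs of the current size is accepted with moderate probability.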

Figure 5-4(a) illustrates the average MCLC of the lightpath routings generated
by the rerouting and Simulated Annealing algorithms. Both algorithms are able to
raise the average MCLC of the initial lightpath routings to almost 4, which is the
connectivity of the logical topologies and is therefore an upper bound on the MCLC
value. In other words, in terms of MCLC, both algorithms provide near-optimal
performance. Figure 5-4(b) illustrates the network failure probability of the lightpath
routings produced by the two algorithms in the low probability regime. Again, the
amounts of reliability improvement achieved by the two methods are very close.

Table 5.1 shows how long it takes for the two algorithms to reach their final
solution, both in terms of number of iterations and running time. Simulated Annealing
requires a much larger number of iterations to converge, where each iteration requires
evaluating the new cost, which involves counting the number of MCLCs and is
non-trivial to compute. This accounts for the long running time of Simulated Annealing.
On the other hand, even though the ILP-based algorithm solves an integer program
in every iteration, its number of iterations is much smaller and it is therefore able to
converge in a much shorter time.

Figure 5-4: Lightpath rerouting ILP vs Simulated Annealing. MCF is the original algorithm MCF_LF
introduced in Section 2.4.2. MCF - ILP is the ILP-based lightpath rerouting algorithm. MCF - SA
is the Simulated Annealing algorithm.
Number of        Number of Iterations      Running Time (seconds)
Logical Nodes      ILP         SA             ILP         SA
      6            3.0       20677            164        7622
      7            4.2       29559            257       11024
      8            5.0       32418            365       12600
      9            6.2       32809            525       27738
     10            7.3       40591            824       15567
     11            8.0       34933           1280       39325
     12            8.2       35471           1104       27592

Table 5.1: Running time of the ILP and Simulated Annealing (SA) lightpath rerouting algorithms.

Robustness  with  Different  Initial  Lightpath  Routings 

As discussed in Section 5.1, we can repeatedly apply lightpath rerouting to any
initial lightpath routing to obtain a locally optimal solution. Next, we investigate the
performance of rerouting using different initial lightpath routings. We apply the
ILP-based rerouting to two sets of initial lightpath routings generated by two different
lightpath routing algorithms: MCF_LF, introduced in Section 2.4.2, and Shortest Path,
which routes each lightpath with the minimum number of physical hops.

Figures 5-5(a) and 5-5(b) show the average MCLC and reliability values of the
two sets of lightpath routings before and after the repeated rerouting steps. Initially,
the lightpath routings generated by Shortest Path have significantly lower MCLC
and reliability than the ones generated by MCF_LF. However, the lightpath rerouting
algorithm is able to improve both sets of lightpath routings to similar MCLC and
reliability values. This illustrates the robustness of the lightpath rerouting approach
with respect to the initial choice of lightpath routing.

Table 5.2 shows the total number of iterations and running time for the lightpath
rerouting algorithm to reach the local optimum, starting with the two different sets
of initial lightpath routings. As the lightpath routings generated by the shortest path
algorithm generally have lower MCLC values, they require more iterations to reach
the local optimum compared to the lightpath routings produced by MCF_LF. However,
the difference in total running time is less significant. This is because the size of the
rerouting ILP formulation is larger when the MCLC of the lightpath routing is large,
and thus takes longer to solve. Since the lightpath routings created by the shortest
path algorithm start with a lower MCLC value, most of the additional rerouting steps
consist of solving the smaller ILPs to bring up the MCLC value. Therefore, these
additional steps take much less time.

Figure 5-5: Lightpath rerouting with different initial lightpath routings.
Number of        Number of Iterations      Running Time (seconds)
Logical Nodes      MCF         SP             MCF         SP
      6            3.0         7.0            164         265
      7            4.2         8.9            257         314
      8            5.0        10.3            365         500
      9            6.2        11.6            525         745
     10            7.3        14.1            824        1238
     11            8.0        14.0           1280        1389
     12            8.2        14.1           1104        1268

Table 5.2: Running time of iterative rerouting, with different initial lightpath routings. MCF
corresponds to initial lightpath routings created by MCF_LF and SP corresponds to initial lightpath
routings created by the shortest path algorithm.


Performance  of  Approximation  Algorithm 

Next, we compare the performance of the approximation algorithm introduced in
Section 5.1.3 with the ILP counterpart. As discussed, the approximation algorithm
is based on the k-shortest-path algorithm, where the parameter k reflects a
tradeoff between running time and reliability performance. We evaluate this algorithm,
APPROX_k, with k = 1, 10 and 100. In addition, we also evaluate the performance
of the randomized rounding algorithm, RR, which solves the ILP REROUTE with the
binary variables $f_{ij}$ relaxed, and uses the optimal relaxed solution to construct the
physical route by randomized rounding.

We use the lightpath routings generated by the Shortest Path algorithm as the
initial routings. Figures 5-6(a) and 5-6(b) show the reliability performance among the
algorithms. While APPROX_1 brings in the majority of the improvement, increasing
the value of k is able to further improve the reliability. In particular, when k = 100,
the approximation algorithm performs almost as well as solving the ILP. Similarly, the
randomized rounding algorithm also performs almost as well as solving the original
ILP.

Table 5.3 compares the running time of each algorithm. As shown in the table,
both the approximation algorithm and randomized rounding are at least several
times faster than the ILP-based algorithm; and the approximation algorithm is faster
overall, potentially because it does not involve solving any mathematical program
at all. This result suggests that both the approximation algorithm and randomized
rounding are promising rerouting approaches to improve the reliability of lightpath
routings for large networks. As we will see in Section 5.3, these algorithms continue
to produce high-quality solutions for networks that are too large to solve the original
ILP optimally.

Figure 5-6: Lightpath rerouting: performance of approximation algorithm.
Number of                        Running Time (seconds)
Logical Nodes   APPROX_1   APPROX_10   APPROX_100      RR        ILP
      6            12          14          24          117        265
      7            20          26          43          136        314
      8            32          43          79          174        500
      9            45          55         123          222        744
     10            68          91         199          330       1238
     11            83         104         254       (missing)    1389
     12           113         135         344          465       1268

Table 5.3: Running times of the ILP, randomized rounding and approximation algorithms.

5.2  Logical  Topology  Augmentation 

The  basic  idea  of  network  augmentation  is  to  add  new  links  to  the  network  in  order  to 
improve  the  reliability  of  the  network.  Although  adding  new  links  should  never  hurt 
reliability,  the  marginal  improvement  in  reliability  may  conceivably  diminish  as  more 
links  are  added  to  the  network.  Thus  there  is  a  tradeoff  between  cost  of  the  new  links 
and  the  reliability  gain  from  them.  In  this  section,  we  will  investigate  the  effectiveness 
of  improving  reliability  of  layered  networks  via  augmentations  to  the  logical  topology. 

A logical topology augmentation, or simply augmentation, to a layered network is
defined to be a set of new logical links to be added to the network, along with their
physical routes. The Single-Link Logical Topology Augmentation Problem involves
finding the best way to augment the logical topology with a single logical link, in
order to maximize the reliability improvement.

The graph augmentation problem has been extensively studied in single-layer
networks. Most of the existing work [25,43,55,59,119] focuses on the problem of finding
the  minimum  (weighted  or  unweighted)  set  of  edges  added  to  the  given  graph  in  order 
to  satisfy  a  certain  requirement  (e.g.  connectivity).  Augmenting  a  layered  network 
not  only  involves  deciding  which  logical  edges  to  add,  but  also  the  physical  routes 
to  take.  The  lightpath  routing  aspect  of  the  augmentation  problem  makes  it  a  much 
harder  problem  than  the  single-layer  case. 

For example, consider a network with two nodes $s$ and $t$ connected by $n$ parallel
edges. Suppose we would like to augment the graph so that the connectivity increases
by 1. The solution in the single-layer setting would be trivial: simply add one more
edge between the two nodes. However, in the multi-layer setting, the minimum
number of additional logical links required to increase the MCLC depends on the
underlying physical topology as well as the lightpath routing. Therefore, augmenting
layered networks to improve reliability appears to be a more challenging problem.

In  the  following,  we  will  study  the  single-link  augmentation  problem.  We  first 
give  a  characterization  of  the  problem  in  Section  5.2.1,  and  discuss  its  similarity  with 
the  lightpath  rerouting  problem  studied  in  Section  5.1.  We  next  develop  a  similar 
ILP  formulation  and  approximation  algorithm  in  Sections  5.2.2  and  5.2.3,  and  present 
some  empirical  results  from  a  case  study  of  augmenting  logical  rings  in  Section  5.2.4. 
We  will  look  into  the  structure  of  the  augmentation  problem  in  Section  5.2.5,  and 
derive  a  lower  bound  on  the  minimum  number  of  logical  links  required  to  increase  the 
MCLC  of  the  network.  The  lower  bound  will  be  used  in  Section  5.2.6  to  evaluate  the 
augmentation  algorithm  based  on  repeated  single-link  augmentations. 

5.2.1  Effects  of  a  Single-Link  Augmentation 

Given a lightpath routing for the physical topology $G_P = (V_P, E_P)$ and logical
topology $G_L = (V_L, E_L)$, the Single-Link Logical Topology Augmentation problem is to
find two logical nodes $s, t \in V_L$ and an $(s,t)$ path in $G_P$, such that the reliability
of the network is maximized by augmenting the network with the new logical link
using the specified physical path. Similar to the rerouting problem, such a logical
link should maximize the reduction in the number of MCLCs. In fact, rerouting
a logical link can be considered as removing an existing logical link from the logical
topology, and then augmenting the logical topology with a new link between the same two
nodes. It is thus not surprising that the characterization of the single-link
augmentation problem is similar to that of the lightpath rerouting problem. However, unlike
rerouting, augmenting the logical topology with a new link never converts a non-cut
into a cross-layer cut. Therefore, in augmentation we only need to consider the effect
of the new logical link on the existing cross-layer cuts.

Suppose that an initial lightpath routing is given for the physical topology
$G_P = (V_P, E_P)$ and the logical topology $G_L = (V_L, E_L)$. Let $d$ be the size of the MCLC
under the initial routing. Let $G_L^S$ be the logical residual graph for any cross-layer cut
$S$, that is, the logical subgraph in which the logical links do not use any physical links
in $S$. The following theorem characterizes the effect of a single-link augmentation:

Theorem 5.7 Let $S$ be a cross-layer cut for a lightpath routing. Augmenting the
network with a new logical link $(s,t)$ over physical route $P$ converts the cross-layer cut
$S$ into a non-cut if and only if:

1. $S$ is a 2-way cross-layer cut.

2. $s$ and $t$ are disconnected in the residual graph for $S$.

3. $P$ does not use any physical links in $S$.

Proof. The proof is the same as that of Theorem 5.1. The new logical link will make the
residual graph connected if and only if the above conditions are true. □

Note that the conditions in Theorem 5.7 are the same as those in Theorem 5.1. Therefore,
the algorithms presented in Sections 5.1.2 and 5.1.3 are mostly applicable here.
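
As an illustration, the three conditions of Theorem 5.7 can be checked directly on a small instance. The sketch below is only a minimal example and not part of the thesis: the function names `residual_components` and `converts_cut`, the representation of a lightpath routing as a list of `(a, b, path)` triples, and the toy network are all assumptions made for illustration. Physical links are treated as undirected.

```python
def residual_components(logical_nodes, routing, cut):
    """Label the connected components of the logical residual graph for `cut`.

    routing: list of (a, b, path) triples, one per logical link, where path
    is the list of physical links (i, j) used by that lightpath.
    """
    failed = {frozenset(e) for e in cut}
    adj = {v: set() for v in logical_nodes}
    for a, b, path in routing:
        # A logical link survives only if none of its physical links failed.
        if not any(frozenset(e) in failed for e in path):
            adj[a].add(b)
            adj[b].add(a)
    label = {}
    count = 0
    for start in logical_nodes:
        if start in label:
            continue
        label[start] = count
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in label:
                    label[y] = count
                    stack.append(y)
        count += 1
    return label, count


def converts_cut(logical_nodes, routing, cut, s, t, new_path):
    """True iff adding logical link (s, t) over new_path makes `cut` a non-cut."""
    label, count = residual_components(logical_nodes, routing, cut)
    failed = {frozenset(e) for e in cut}
    return (
        count == 2                    # condition 1: the cut is 2-way
        and label[s] != label[t]      # condition 2: s and t are separated
        and not any(frozenset(e) in failed for e in new_path)  # condition 3
    )


# Toy instance (hypothetical): a 4-node logical ring routed hop by hop over a
# physical ring that also contains the chord (1, 3).
nodes = [0, 1, 2, 3]
routing = [(0, 1, [(0, 1)]), (1, 2, [(1, 2)]), (2, 3, [(2, 3)]), (3, 0, [(3, 0)])]
cut = [(0, 1), (1, 2)]  # this pair of fiber failures isolates logical node 1
print(converts_cut(nodes, routing, cut, 1, 3, [(1, 3)]))          # route over the chord
print(converts_cut(nodes, routing, cut, 1, 3, [(1, 0), (0, 3)]))  # route reusing (0, 1)
```

In this toy instance the new logical link (1, 3) removes the cut only when it is routed over the chord; the same logical link routed through the failed fiber (0, 1) violates the third condition.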

5.2.2  ILP  for  Single-Link  Logical  Topology  Augmentation 

The ILP for the single-link logical topology augmentation problem is similar to the
formulation in Section 5.1.2, and can be interpreted as a path selection problem
on the auxiliary graph $G'_P = (V'_P, E'_P)$, where $V'_P = V_P \cup \{u, v\}$ and
$E'_P = E_P \cup \{(u,s), (s,v) : \forall s \in V_L\}$, as shown in Figure 5-3.

Let $d$ be the size of the MCLC and $C_d$ be the set of 2-way cross-layer cuts of size $d$
in the given lightpath routing. We first define the following variables and parameters:

1. Variables:

• $\{g_{st} : (s,t) \in V_L \times V_L\}$: 1 if logical link $(s,t)$ is added to the network, and
0 otherwise.

• $\{f_{ij} : (i,j) \in E'_P\}$: Flow variables describing a path in $G'_P$ from node $u$ to
node $v$.

• $\{y_c : c \in C_d\}$: 1 if the cross-layer cut $c$ is converted into a non-cut by the
augmentation, and 0 otherwise.

2. Parameters:

• $\{h^c_{st} : c \in C_d, (s,t) \in E_L\}$: 1 if logical nodes $s$ and $t$ are disconnected by
the 2-way cut $c$, and 0 otherwise.

• $\{l^c_{ij} : \forall c \in C_d, (i,j) \in E_P\}$: 1 if physical link $(i,j)$ is in the set of physical
links $c$, and 0 otherwise.

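The logical augmentation problem can then be formulated as the following ILP:

\begin{align}
\text{AUGMENT}: \quad \text{Maximize} \quad & \sum_{c \in C_d} y_c, \quad \text{subject to:} \notag \\
g_{st} \le{} & (f_{us} + f_{tv})/2, && \forall (s,t) \in V_L \times V_L \tag{5.10} \\
y_c \le{} & \sum_{(s,t) \in V_L \times V_L} h^c_{st}\, g_{st}, && \forall c \in C_d \tag{5.11} \\
y_c \le{} & 1 - l^c_{ij} f_{ij}, && \forall (i,j) \in E_P,\ \forall c \in C_d \tag{5.12} \\
& \{(i,j) : f_{ij} = 1\} \text{ forms a } (u,v)\text{-path in } G'_P \tag{5.13} \\
& f_{ij}, g_{st} \in \{0,1\}, \quad 0 \le y_c \le 1 \notag
\end{align}
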

In a feasible solution to the formulation, the variables $f_{ij}$ represent a path
$u \to s \rightsquigarrow t \to v$, as described by Constraint (5.13). This corresponds to the new logical link
to be added to the network, along with its physical route. Constraint (5.10) ensures
that $g_{st} = 1$ if and only if $(s,t)$ is the new logical link selected. Constraints (5.11)
and (5.12) describe the conditions in Theorem 5.7. The variable $y_c$ describes whether
the cross-layer cut $c$ is converted into a non-cut by the augmentation. Therefore, the
ILP maximizes the number of such conversions, which translates to maximizing the
improvement in reliability.
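
To make the objective concrete, the following sketch evaluates it by exhaustive enumeration on a tiny instance instead of calling an ILP solver. It is a stand-in for illustration only, not the thesis's method: the names `simple_paths` and `best_single_link`, the encoding of each cut as a `(links, side)` pair, and the assumption that logical nodes carry the same labels as their physical nodes are all hypothetical. Enumerating simple paths is exponential in general, so this is only sensible for very small graphs.

```python
from itertools import combinations


def simple_paths(adj, s, t, visited=None):
    """Yield every simple s-t path in the physical graph as a list of nodes."""
    visited = (visited or []) + [s]
    if s == t:
        yield visited
        return
    for nxt in adj[s]:
        if nxt not in visited:
            yield from simple_paths(adj, nxt, t, visited)


def best_single_link(logical_nodes, physical_edges, cuts):
    """Return (score, (s, t), path) maximizing the number of converted cuts.

    cuts: list of (links, side) pairs, one per 2-way cut c, where links is the
    collection of physical links in c and side[x] in {0, 1} records on which
    side of the cut logical node x lies.
    """
    adj = {}
    for i, j in physical_edges:
        adj.setdefault(i, []).append(j)
        adj.setdefault(j, []).append(i)
    cuts = [({frozenset(e) for e in links}, side) for links, side in cuts]
    best = (-1, None, None)
    for s, t in combinations(logical_nodes, 2):
        for path in simple_paths(adj, s, t):
            used = {frozenset(e) for e in zip(path, path[1:])}
            # A cut counts (y_c = 1) only if it separates s and t, as in (5.11),
            # and the chosen route avoids all of its links, as in (5.12).
            score = sum(
                1 for links, side in cuts
                if side[s] != side[t] and not (used & links)
            )
            if score > best[0]:
                best = (score, (s, t), path)
    return best


# Hypothetical instance: physical ring 0-1-2-3 with chord (1, 3), two cuts.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
cuts = [
    ([(0, 1), (1, 2)], {0: 1, 1: 0, 2: 1, 3: 1}),
    ([(0, 1), (2, 3)], {0: 0, 3: 0, 1: 1, 2: 1}),
]
print(best_single_link([0, 1, 2, 3], edges, cuts))
```

The inner sum plays the role of the objective: a cut contributes only when both the separation condition and the route-avoidance condition hold for the candidate link.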
5.2.3 An Approximation Algorithm for Logical Topology Augmentation

One can also design an approximation algorithm similar to REROUTE_SP introduced
in Section 5.1.3 for the single-link logical topology augmentation problem. We will
again focus on the following question: Given a layered network, and a new logical link
$(s,t)$, find the physical route for $(s,t)$ such that the resulting number of cross-layer
cuts of size $d$ is minimized. We can then apply the algorithm for this problem to
every possible pair of logical nodes $s$ and $t$, to find the new logical link that would
result in the maximum reliability improvement.

Let $d$ be the size of the MCLC of the layered network and $C^{st}$ be the set of 2-way
cross-layer cuts of size $d$ that separate the logical nodes $s$ and $t$. Then by Theorem 5.7,
the set $C^{st}_{ij} = \{S \in C^{st} : (i,j) \in S\}$ represents the sets in $C^{st}$ that will remain
cross-layer cuts if the physical link $(i,j)$ is used by the $(s,t)$ path $Q$. We can
then develop an approximation algorithm for the augmentation problem similar to
REROUTE_SP:

Algorithm 5 AUGMENT_SP(s, t)

1: Construct a weighted graph on $G_P = (V_P, E_P)$, where each edge $(i,j)$ is assigned
weight $w(i,j) = |C^{st}_{ij}|$.

2: Run Dijkstra's algorithm to find the shortest $(s,t)$-path in the weighted graph.


Since each cross-layer cut $S$ in $C^{st}$ has size $d$, there are exactly $d$ physical links
$(i,j)$ such that $S \in C^{st}_{ij}$. As a result, AUGMENT_SP is a $d$-approximation algorithm,
with the same proof as Theorem 5.4.
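
The two steps of AUGMENT_SP can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the thesis's implementation: the function name `augment_sp`, the input format (a list of undirected physical links and a list of cuts, each a collection of links) and the toy data are hypothetical, and the set of cuts separating $s$ and $t$ is assumed to have been enumerated beforehand. Node labels are assumed to be mutually comparable (for example, all integers) so that heap entries can be ordered.

```python
import heapq


def augment_sp(physical_edges, cuts_st, s, t):
    """Return (path, weight) of a minimum-weight physical (s, t)-path.

    cuts_st: the 2-way cross-layer cuts of size d that separate s and t,
    each given as a collection of physical links (i, j).
    """
    cut_sets = [{frozenset(e) for e in cut} for cut in cuts_st]
    adj = {}
    weight = {}
    for i, j in physical_edges:
        e = frozenset((i, j))
        # Step 1: w(i, j) = number of cuts that contain link (i, j).
        weight[e] = sum(1 for cut in cut_sets if e in cut)
        adj.setdefault(i, []).append(j)
        adj.setdefault(j, []).append(i)

    # Step 2: Dijkstra from s; all weights are non-negative counts.
    dist = {s: 0}
    prev = {}
    done = set()
    heap = [(0, s)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v in adj.get(u, []):
            cand = d_u + weight[frozenset((u, v))]
            if v not in dist or cand < dist[v]:
                dist[v] = cand
                prev[v] = u
                heapq.heappush(heap, (cand, v))
    if t not in dist:
        return None, None  # s and t are physically disconnected
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[t]


# Hypothetical example: a 4-node physical ring and two cuts separating 0 and 2.
edges = [(0, 1), (1, 2), (0, 3), (3, 2)]
cuts = [[(0, 1), (2, 3)], [(0, 1), (1, 2)]]
print(augment_sp(edges, cuts, 0, 2))
```

The returned path weight counts (cut, link) incidences along the route, so it upper-bounds the number of cuts that remain; a cut of size $d$ can be counted up to $d$ times, which is where the approximation factor comes from.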

5.2.4  A  Case  Study:  Augmenting  a  Logical  Ring 

In this section, we consider augmenting logical rings of different sizes to study the
reliability improvement by the augmentation approach. We start with 10-node and
14-node logical rings on the augmented NSFNET, as shown in Figure 5-7, and run
the single-link augmentation algorithm repeatedly.

(a) 10 Node Logical Ring

Figure 5-7: Logical rings on extended NSFNET.

The cross-layer reliability of the networks after each augmentation step is shown
in Figure 5-8. With link failure probability p = 0.01, the unreliability declines as we
add more logical links to the rings. The key observation from these figures is that
the improvement in reliability is most prominent when the augmentation increases
the MCLC of the network. This further validates our approach to maximize the
MCLC as the primary objective. In the case where the additional link does not cause
an MCLC increase, the marginal reliability improvement decreases with the current
MCLC value. This means that augmentation is most effective when MCLC is low.

5.2.5  Minimum  Augmenting  Edge  Set 

Based on the observation from the case study, the Minimum Augmenting Edge Set,
defined to be the smallest set of new logical links required to increase the MCLC of
the layered network, is of particular interest. Clearly, the MCLC value for a layered
network is upper bounded by the logical connectivity. Therefore, given a layered
network with MCLC value $d$, the number of new logical links needed to increase the
MCLC value is at least the number of edges required to augment the logical topology
to connectivity $d + 1$. This gives a simple lower bound on the size of the minimum
augmenting edge set.

Figure 5-8: Impact on reliability by augmenting logical rings. (a) 10 Node Logical Ring.
(b) 14 Node Logical Ring. Horizontal axis: Number of Logical Links Added; vertical axis:
logarithmic scale from 1e-07 to 0.01; curve for p=0.01, with regions marked MCLC=2, MCLC=3
and MCLC=4.


In the case of logical rings of size $n$, this means at least $\lceil n/2 \rceil$ logical links are required
to increase the MCLC, which happens to be tight for the results in Figure 5-8. In other
words, augmenting the network incrementally using the single-link augmentation ILP
performs optimally in this particular case.

In  general,  however,  a  logical  topology  with  high  connectivity  can  still  have  low 
MCLC  when  embedded  in  a  physical  network,  and  this  simple  lower  bound  will  not 
be  useful.  In  the  next  section,  we  present  a  method  to  establish  a  tighter  lower  bound. 

Lower  Bound  on  Minimum  Augmenting  Edge  Set 

We can develop a tighter lower bound on the size of the minimum augmenting
edge set by taking the structure of the lightpath routing into account. Given
the physical topology $G_P = (V_P, E_P)$, the logical topology $G_L = (V_L, E_L)$ and the
lightpath routing, we start with a few definitions.

Definition 5.1 Given the lightpath routing, a set of logical links $L$ is covered by a
set of physical links $C$ if every link in $L$ uses at least one physical link in $C$.

Definition 5.2 A subset of logical nodes $S \subset V_L$ is d-protected if and only if the
logical cut set $\delta(S)$ is not covered by any set of $d$ physical links. In other words, given
any $d$-physical link failure, at least one of the logical links in $\delta(S)$ survives.

Definition 5.3 The d-deficit $\Delta_d(S)$ for a subset of logical nodes $S \subset V_L$ is the
minimum number of new logical links in $(S, V_L - S)$ that need to be added in order to
make $S$ d-protected. If $S$ cannot be made d-protected (because the connectivity of the
physical topology is less than $d$), $\Delta_d(S)$ is defined to be $\infty$.

The  following  theorem  relates  the  d-protectedness  of  the  logical  node  sets  to  the 
MCLC  of  the  layered  network. 

Theorem 5.8 The MCLC of a layered network is at least $d + 1$ if and only if $S$ is
d-protected for all $S \subset V_L$.

Proof. Suppose there exists a set of logical nodes $S \subset V_L$ that is not d-protected.
Then there exists a set of $d$ physical links that cover all logical links in $\delta(S)$. As a
result, failure of this set of physical links will disconnect $S$ from the rest of the logical
topology, implying that the MCLC is at most $d$.
On the other hand, suppose the node set $S$ is d-protected for all $S \subset V_L$. Then
after removing any $d$ physical links from the layered network, at least one logical link
in $\delta(S)$ survives for any $S \subset V_L$, which implies that the MCLC is at least $d + 1$. □
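
The equivalence in Theorem 5.8 can be checked by brute force on a small example. The sketch below is illustrative only and relies on assumptions that are not part of the thesis: the function names, the `(a, b, path)` routing representation and the toy ring are hypothetical, and both routines enumerate subsets exhaustively, so they are usable only on very small networks.

```python
from itertools import combinations


def _survivors(routing, failed):
    """Logical links (a, b) whose lightpaths avoid every failed physical link."""
    return [(a, b) for a, b, path in routing
            if not any(frozenset(e) in failed for e in path)]


def _connected(nodes, links):
    """True iff the graph (nodes, links) is connected."""
    adj = {v: set() for v in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for y in adj[stack.pop()]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(nodes)


def mclc(nodes, physical_edges, routing):
    """Size of a smallest physical link set whose failure disconnects the logical topology."""
    edges = [frozenset(e) for e in physical_edges]
    for size in range(len(edges) + 1):
        for failed in combinations(edges, size):
            if not _connected(nodes, _survivors(routing, set(failed))):
                return size
    return None


def is_d_protected(S, physical_edges, routing, d):
    """Definition 5.2: no set of d physical links covers the logical cut set of S."""
    S = set(S)
    cut_paths = [path for a, b, path in routing if (a in S) != (b in S)]
    edges = [frozenset(e) for e in physical_edges]
    for failed in combinations(edges, d):
        failed = set(failed)
        if all(any(frozenset(e) in failed for e in path) for path in cut_paths):
            return False
    return True


def all_d_protected(nodes, physical_edges, routing, d):
    """Check every non-empty proper subset S of the logical nodes."""
    return all(
        is_d_protected(S, physical_edges, routing, d)
        for k in range(1, len(nodes))
        for S in combinations(nodes, k)
    )


# Hypothetical instance: a 4-node logical ring routed hop by hop on a physical ring.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
routing = [(a, b, [(a, b)]) for a, b in edges]
m = mclc(nodes, edges, routing)
for d in (1, 2):
    print(d, m >= d + 1, all_d_protected(nodes, edges, routing, d))
```

For each value of $d$ tried, the two printed booleans should agree, which is exactly the statement of the theorem on this instance.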

The  next  theorem  provides  the  framework  in  establishing  the  lower  bound  on  the 
size  of  the  minimum  augmenting  edge  set  for  a  lightpath  routing. 

Theorem 5.9 Given a layered network, let $d$ be the MCLC value. The size of the minimum
augmenting edge set for the layered network is at least
$\frac{1}{2} \sum_{V_L^i \in T} \Delta_d(V_L^i)$, for any partition
$T = \{V_L^1, \ldots, V_L^k\}$ of the logical node set $V_L$.

Proof. Any augmenting edge set $Y$ that increases the MCLC of the network to $d + 1$
must make every $V_L^i$ d-protected, by Theorem 5.8. By definition of $\Delta_d$, for all $i$, such an
augmenting edge set must contain $\Delta_d(V_L^i)$ logical links with one end point in $V_L^i$.
Since each logical link in $Y$ is counted for at most two components of the partition, this
implies that $Y$ must contain at least $\frac{1}{2} \sum_{V_L^i \in T} \Delta_d(V_L^i)$ logical links. □

Theorem 5.9 suggests that we can choose any partition of $V_L$ and establish a lower
bound by computing the deficit $\Delta_d(V_L^i)$ for each component in the partition. We will
discuss how the deficit can be computed in Appendix 5.5. In the rest of this section,
we will discuss how to choose a good partition of $V_L$ to establish a meaningful lower
bound.

Definition 5.4 Two logical nodes $x$ and $y$ are d-connected if they stay logically
connected to each other under any set of $d - 1$ physical failures.

The following theorem shows that d-connectedness is a transitive relation.

Theorem 5.10 Given logical nodes $x, y, z$ in a layered network, if $x$ is d-connected
to $y$, and $y$ is d-connected to $z$, then $x$ is d-connected to $z$.

Proof. Suppose $x$ is not d-connected to $z$. Then there exists a set of $d - 1$ physical
links $C$ whose removal will disconnect nodes $x$ and $z$. Therefore, the node $y$ will be
disconnected from either $x$ or $z$ on the removal of $C$, implying that either $x, y$ are not
d-connected or $y, z$ are not d-connected, which is a contradiction. □
Given any partition $T = \{V_L^1, \ldots, V_L^k\}$ of $V_L$, if there exist $x$ and $y$ that are
d-connected to each other such that they belong to different components $V_L^i$ and $V_L^j$,
then $\Delta_d(V_L^i) = \Delta_d(V_L^j) = 0$. As a result
$\Delta_d(V_L^i \cup V_L^j) \ge \Delta_d(V_L^i) + \Delta_d(V_L^j)$. In other
words, the sum $\sum_{V_L^i \in T} \Delta_d(V_L^i)$ in Theorem 5.9 will not decrease if the components $V_L^i$
and $V_L^j$ are merged. This motivates the following procedure:


Algorithm 6 MERGE_COMPONENT(s, t)

1: Create an initial partition for $V_L$: $T := \{V_L^1, \ldots, V_L^{|V_L|}\}$, where each component
$V_L^i$ contains a single logical node.

2: while $\exists\, x \in V_L^i$, $y \in V_L^j$, $i \ne j$, such that $x$ and $y$ are d-connected, do:

Replace $V_L^i$, $V_L^j$ in $T$ by $V_L^i \cup V_L^j$.

3: Return $T$.


At the end of the procedure, each component $V_L^i$ in the partition $T$ output by
MERGE_COMPONENT contains nodes that are d-connected to one another, and
nodes across different components are not d-connected. Therefore, this partitioning
exposes components among which logical links need to be added.
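
The merging procedure can be sketched with a union-find structure. This is a minimal illustration under assumptions, not the thesis's implementation: the function names, the `(a, b, path)` routing format and the toy network are hypothetical, and d-connectedness is tested by enumerating every set of $d - 1$ physical failures, which is only practical for small instances.

```python
from itertools import combinations


def connected_after(x, y, routing, failed):
    """True iff x reaches y over logical links that avoid all failed physical links."""
    adj = {}
    for a, b, path in routing:
        if not any(frozenset(e) in failed for e in path):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    seen, stack = {x}, [x]
    while stack:
        for nxt in adj.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return y in seen


def d_connected(x, y, routing, physical_edges, d):
    """Definition 5.4, checked over every set of d - 1 physical failures."""
    edges = [frozenset(e) for e in physical_edges]
    return all(connected_after(x, y, routing, set(failed))
               for failed in combinations(edges, d - 1))


def merge_component(nodes, routing, physical_edges, d):
    """Partition the logical nodes into classes of mutually d-connected nodes."""
    parent = {v: v for v in nodes}  # step 1: every node starts as its own component

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Step 2: merge the components of every d-connected pair.
    for x, y in combinations(nodes, 2):
        if find(x) != find(y) and d_connected(x, y, routing, physical_edges, d):
            parent[find(x)] = find(y)

    groups = {}
    for v in nodes:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())  # step 3


# Hypothetical instance: a logical triangle 0-1-2 plus a pendant node 3,
# with every logical link routed over its own physical link.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
routing = [(a, b, [(a, b)]) for a, b in edges]
print(merge_component(nodes, routing, edges, 2))
```

A single pass over all node pairs suffices here: components only grow, so any pair still in different components at the end was tested and found not to be d-connected.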

5.2.6  Simulation  Results 

In  Section  5.2.2,  we  presented  an  ILP  formulation  for  the  single-link  augmentation 
problem  to  maximize  the  reliability  improvement.  One  can  repeatedly  apply  the 
algorithm  to  incrementally  augment  the  network  to  construct  an  augmenting  edge 
set.  In  this  section,  we  will  compare  the  solution  provided  by  this  approach  with  the 
lower  bound  given  in  Section  5.2.5. 

Using the augmented NSFNET (Figure 2-3) as the physical topology and the
same set of 350 random logical topologies as in Section 5.1.4, we considered lightpath
routings with an MCLC value of 3, and studied the number of new logical links needed by
the algorithm to raise the MCLC value to 4. This number is compared to the lower
bound given by Theorem 5.9. Note that the simple lower bound introduced at the
beginning of Section 5.2.5 based on logical connectivity would not be helpful in this
case, since the connectivity of the logical topologies is already 4.
The number of new logical links needed by the algorithm, as well as the lower
bound given by Theorem 5.9, are shown in Figure 5-9. In 330 of the 350 instances,
the number of logical links required by the algorithm is able to meet the lower bound,
whereas in the other 20 instances the number is one larger than the lower bound.
This suggests that the incremental augmentation approach is able to come up with
an optimal or near-optimal augmenting edge set in each case. In addition, the result
shows that Theorem 5.9 gives us a good lower bound that can be used for evaluating
augmentation algorithms.

Figure 5-9: Size of augmenting edge set generated by incremental single-link augmentation vs lower
bound.


Finally, we study the marginal benefit of augmenting the logical topology, using
the lightpath routings produced by the rerouting method in Section 5.1.4 as
the baseline. Figure 5-10 shows the improvement in reliability by augmenting the
network with different numbers of logical links. As the starting lightpath routings
already achieve the maximum possible MCLC value, the improvement shown in the
figure is due to the reduction in the number of MCLCs. Even though the marginal
improvement in reliability diminishes with more logical links added to the network,
overall, the reliability of the network can be further improved by augmentation.

Figure 5-10: Improving reliability via augmentation.

5.3 Case Study: A Real-World IP-Over-WDM Network

Most  of  the  simulations  presented  in  this  thesis  are  on  the  14-node  augmented  NSFNET 
as  the  physical  topology.  In  this  section,  we  will  study  the  performance  of  various 
algorithms  on  a  large  layered  network  based  on  a  real-world  IP-over-WDM  network. 
The physical and logical topologies, shown in Figure 5-11, are constructed based on
the network maps available from Qwest Communications [1].

The  study  on  networks  of  larger  size  allows  us  to  reevaluate  the  performance  of 
the  lightpath  algorithms,  both  in  terms  of  scalability  and  solution  quality.  In  this 
study,  we  have  attempted  to  run  the  various  lightpath  routing  algorithms  introduced 
throughout  the  thesis,  including: 

1. SURVIVE: The existing survivable lightpath routing algorithm introduced in [76],
used as the benchmark for comparing with the algorithms introduced in this
thesis. The lightpath routing is computed using randomized rounding
(Section 2.4.3) on the optimal solution of the linear relaxation.

2. MCF_MinCut: The simple multi-commodity flow formulation introduced in
Section 2.4.2. The lightpath routing is computed using randomized rounding on
the optimal solution of the linear relaxation.

3. MCF_LF: The enhanced multi-commodity flow formulation introduced in
Section 2.4.2, where each constraint captures the impact of a fiber failure on each
logical cut. The lightpath routing is computed using randomized rounding on
the optimal solution of the linear relaxation.

4. REROUTE_ILP: The iterative lightpath rerouting algorithm, based on the ILP
presented in Section 5.1.2.

5. REROUTE_RR: The iterative lightpath rerouting algorithm, based on the ILP
presented in Section 5.1.2, with the variables $f_{ij}$ relaxed. The physical route
is obtained by choosing the best solution out of 1000 iterations of randomized
rounding on the optimal fractional solution to $f_{ij}$.

6. REROUTE_Approx: The iterative lightpath rerouting algorithm, based on the
k-shortest path algorithm presented in Section 5.1.3, where k is set to 5000 in our
experiment.

7. AUGMENT_ILP: The logical topology augmentation algorithm, based on the ILP
presented in Section 5.2.2.

8. AUGMENT_Approx: The logical topology augmentation algorithm, based on the
k-shortest path algorithm presented in Section 5.2.3, where k is set to 5000 in
our experiment.

(b) IP/MPLS (logical) network. The numbers indicate the number of parallel logical
links between the logical nodes.

Figure 5-11: Physical and logical topologies.

Table 5.4 summarizes the results of the lightpath routing algorithms. In general,
algorithms that solve ILPs (such as REROUTE_ILP and AUGMENT_ILP) or large linear
programs (such as MCF_LF) are no longer feasible, due to the large memory requirement
of the ILP and LP solvers. This limitation of ILP-based solutions justifies the design
of more scalable methods, such as the randomized rounding algorithm REROUTE_RR,
as well as the approximation algorithms REROUTE_Approx and AUGMENT_Approx. The
approximation algorithms, which are based on the successive shortest path algorithm,
run in polynomial time and require a much smaller memory footprint than solving
the ILP, and are therefore able to finish successfully for networks of this scale.


Algorithm          Terminates Successfully?
SURVIVE            Yes
MCF_MinCut         Yes
MCF_LF             No
REROUTE_ILP        No
REROUTE_RR         Yes
REROUTE_Approx     Yes
AUGMENT_ILP        No
AUGMENT_Approx     Yes

Table 5.4: Scalability comparisons among different lightpath routing algorithms.

We next compare the quality of the lightpath routings produced by the algorithms
SURVIVE, MCF_MinCut, REROUTE_RR, REROUTE_Approx and AUGMENT_Approx (with
different numbers of new logical links). The MCLC values and the number of MCLCs
of the lightpath routings generated by each algorithm are shown in Table 5.5. These
numbers are compared against the lower bound, which is computed by counting the
number of minimum sized physical fiber sets whose removal will physically disconnect
some logical nodes. These sets of physical links are cross-layer cuts regardless of the
lightpath routing, and therefore will provide a lower bound on the number of MCLCs.

It was observed in Section 2.5 that the survivability performance of the
multi-commodity flow formulation MCF_MinCut declines as the network size increases. In
this case, the MCLC value of the lightpath routing produced by MCF_MinCut is no
better than that of SURVIVE, although by spreading the logical links over different physical
fibers, the algorithm manages to reduce the number of logical cuts that are covered
by a 2-fiber failure. On the other hand, the rerouting algorithms REROUTE_RR and
REROUTE_Approx continue to be able to improve the MCLC to the maximum possible
value of 4 (limited by the physical connectivity). Augmenting the logical topology
can further improve the reliability of the layered network by reducing the number of
MCLCs, though the incremental effect declines as more logical links are added to the
network. The number of MCLCs hits the lower bound when the logical topology is
augmented with 9 additional logical links.

Figure 5-12 compares the algorithms in terms of the cross-layer reliability in
the low failure probability regime. Consistent with Table 5.5, the iterative
algorithms presented in this chapter achieve significantly higher reliability than the joint
lightpath routing algorithms. In particular, the majority of the improvement is
achieved by the lightpath rerouting approach, especially by the approximation
algorithm REROUTE_Approx. Therefore, even if adding new logical links is not an option,
the lightpath rerouting method allows us to obtain a lightpath routing that is close
to optimal. In summary, the approximation algorithms introduced in Sections 5.1.3
and 5.2.3 provide a good tradeoff between scalability and solution quality.


Algorithm            MCLC    Number of MCLCs
SURVIVE                2           26
MCF_MinCut             2            5
REROUTE_RR             4          458
REROUTE_Approx         4          216
AUGMENT_Approx_1       4           84
AUGMENT_Approx_2       4           49
AUGMENT_Approx_3       4           34
AUGMENT_Approx_4       4           29
AUGMENT_Approx_5       4           25
AUGMENT_Approx_6       4           23
AUGMENT_Approx_7       4           22
AUGMENT_Approx_8       4           21
AUGMENT_Approx_9       4           20
Lower Bound            4           20

Table 5.5: MCLC values and MCLC counts of different lightpath routings. The lightpath routing
on a logical topology augmented with k new logical links is denoted by AUGMENT_Approx_k.


Figure 5-12: Unreliability of different lightpath routings.

5.4 Conclusion

In this chapter, we propose two methods to improve the reliability of a layered network
in the low failure probability regime. The main idea behind these methods is to
maximize the size of the MCLC, and to minimize the number of MCLCs, through
iterative local changes to the layered network. In the lightpath rerouting method,
each iterative step replaces the physical route of an existing logical link
with a new route that results in a smaller number of MCLCs. In the logical topology
augmentation method, each iteration augments the logical topology with a new link that
eliminates the maximum number of MCLCs. By applying the methods iteratively to
a layered network, we can obtain a locally optimal lightpath routing in the low failure
probability regime.
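
The rerouting step summarized above can be illustrated on a toy instance. The following is a minimal sketch, not the ILP or the approximation algorithm of this chapter: it assumes a layered network small enough that all fiber sets can be enumerated, and the names `logical_connected`, `mclc` and `reroute_step`, as well as the four-fiber example, are hypothetical.

```python
from itertools import combinations

def logical_connected(nodes, links, failed):
    """True if the logical links whose routes avoid all failed fibers connect every node."""
    adj = {n: set() for n in nodes}
    for a, b, route in links:
        if not (set(route) & failed):
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for m in adj[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(nodes)

def mclc(nodes, links, fibers):
    """Exhaustively find (MCLC value, number of MCLCs); exponential, tiny examples only."""
    for size in range(1, len(fibers) + 1):
        count = sum(1 for cut in combinations(fibers, size)
                    if not logical_connected(nodes, links, set(cut)))
        if count:
            return size, count
    return len(fibers) + 1, 0  # no fiber set disconnects the logical topology

def reroute_step(nodes, links, fibers, alternatives):
    """Try every single-link reroute; prefer a larger MCLC, then fewer MCLCs."""
    def score(candidate):
        value, count = mclc(nodes, candidate, fibers)
        return (-value, count)
    best = links
    for i, routes in alternatives.items():
        for route in routes:
            trial = links[:i] + [(links[i][0], links[i][1], route)] + links[i + 1:]
            if score(trial) < score(best):
                best = trial
    return best

# Hypothetical example: physical ring 1-2-3-4-1 with fibers a=(1,2), b=(2,3), c=(3,4), d=(4,1).
nodes = {1, 2, 3}
fibers = ["a", "b", "c", "d"]
links = [(1, 2, {"a"}), (2, 3, {"b"}), (1, 3, {"a", "b"})]
alternatives = {2: [{"c", "d"}]}  # reroute logical link (1, 3) through node 4

print(mclc(nodes, links, fibers))     # (1, 2)
improved = reroute_step(nodes, links, fibers, alternatives)
print(mclc(nodes, improved, fibers))  # (2, 5)
```

In this toy case a single reroute raises the MCLC from 1 to 2. Repeating `reroute_step` until no candidate improves the score mirrors the iterative, locally optimal behaviour described above, with exhaustive search standing in for the per-iteration optimization.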

For both the rerouting and augmentation problems, we develop an ILP, as well
as a polynomial-time approximation algorithm, to compute a (near-)optimal solution
in each iteration. Simulation results show that through such iterative incremental
improvements, we can obtain a lightpath routing with significantly higher reliability
than the routings produced by existing lightpath routing algorithms, including the
algorithms introduced in Chapter 2.

The iterative approach introduced in this chapter is also generally more scalable
than the conventional lightpath routing algorithms, which compute the
physical routes for all logical links jointly. By considering only local changes, one
logical link at a time, the reliability optimization problem is broken down into smaller,
manageable subproblems, which can then be solved efficiently by the approximation
algorithms. This provides a viable approach to the design of reliable, large-scale
layered networks in practice.

5.5 Chapter Appendix: Computing Deficit of a Logical Node Set

In Section 5.2.5, we define the d-deficit $\lambda_d(S)$ of a logical node set S to be the
minimum number of logical links that need to be added to make S d-protected, given
a layered network with MCLC d. In this section, we discuss how this value can be
computed.

First note that sometimes it is impossible to make the node set S d-protected. For
example, if only d physical fibers connect S to the other physical nodes,
the failure of these d fibers will disconnect all logical links that connect S to $V_L - S$.
In that case, $\lambda_d(S)$ is defined to be $\infty$. In the rest of the section, we assume that the
physical topology is (d + 1)-connected, so that it is possible to make the node set S
d-protected.

We  will  present  an  ILP  that  computes  the  smallest  set  of  new  logical  links  to  make 
the  node  set  S  d-protected.  The  ILP  relies  on  the  following  result: 

Theorem 5.11 $\lambda_d(S) \le d + 1$.

Proof. Pick $x \in S$ and $y \in V_L - S$. Since the physical topology is (d + 1)-connected,
there exist d + 1 physically disjoint paths between x and y. Therefore, if we add
d + 1 copies of a new logical link (x, y), each routed over a different one of these
physically disjoint paths, any d-fiber failure can break at most d of the copies, so at
least one copy survives. Therefore, $\lambda_d(S) \le d + 1$. □
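
The counting argument in the proof can be checked by brute force on a small example. This is only an illustrative sketch: the fiber names and routes are made up, and `some_route_survives` is a hypothetical helper, not part of the thesis.

```python
from itertools import combinations

def some_route_survives(routes, fibers, d):
    """True if every failure of d fibers leaves at least one route untouched."""
    return all(
        any(not (set(route) & set(failed)) for route in routes)
        for failed in combinations(fibers, d)
    )

fibers = ["f1", "f2", "f3", "f4", "f5", "f6"]
d = 2
disjoint = [{"f1", "f2"}, {"f3", "f4"}, {"f5"}]  # d + 1 fiber-disjoint routes
shared = [{"f1", "f2"}, {"f1", "f3"}, {"f4"}]    # two routes share fiber f1

print(some_route_survives(disjoint, fibers, d))      # True
print(some_route_survives(shared, fibers, d))        # False: failing f1 and f4 cuts all three
print(some_route_survives(disjoint[:2], fibers, d))  # False: only d routes remain
```

The second and third calls show why both conditions matter: the routes must be physically disjoint, and there must be d + 1 of them.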
As a result of the theorem, we can formulate an ILP to select up to d + 1 paths
between S and $V_P - S$, such that for any cross-layer cut C of size d, at least one of
the paths does not use any fiber in C. Given the physical topology $G_P = (V_P, E_P)$,
we construct an auxiliary graph $G'_P = (V'_P, E'_P)$, where $V'_P = V_P \cup \{u, v\}$ and
$E'_P = E_P \cup \{(u, x) : x \in S\} \cup \{(x, v) : x \in V_P - S\}$, as shown in Figure 5-13. In the
auxiliary graph, a new source node u and a new sink node v are added; the source node u
is connected to all nodes in S, and the sink node v is connected to all nodes in $V_P - S$.
As a result, any (u, v)-path in $G'_P$ corresponds to a logical link from S to $V_P - S$
together with its physical route.


Figure 5-13: Auxiliary graph $G'_P$ for the ILP. Nodes u and v are the new source and sink nodes,
and the dashed lines are the new edges.
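
The auxiliary graph construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: edges are stored as ordered pairs, the labels "u" and "v" are assumed not to clash with existing node names, and both function names are hypothetical.

```python
def build_auxiliary_graph(V_P, E_P, S, u="u", v="v"):
    """Add a source u joined to every node in S and a sink v joined to every node outside S."""
    S = set(S)
    others = set(V_P) - S
    V_aux = set(V_P) | {u, v}
    E_aux = set(E_P) | {(u, x) for x in S} | {(x, v) for x in others}
    return V_aux, E_aux

def path_to_logical_link(path):
    """Map a (u, v)-path [u, x, ..., y, v] to the logical link (x, y) and its physical route."""
    inner = path[1:-1]                    # drop the artificial source and sink
    route = list(zip(inner, inner[1:]))   # consecutive physical hops
    return (inner[0], inner[-1]), route

# Hypothetical physical ring on four nodes, with S = {1, 2}.
V_P = {1, 2, 3, 4}
E_P = {(1, 2), (2, 3), (3, 4), (4, 1)}
V_aux, E_aux = build_auxiliary_graph(V_P, E_P, S={1, 2})

print(sorted(E_aux - E_P, key=str))
print(path_to_logical_link(["u", 1, 2, 3, "v"]))  # ((1, 3), [(1, 2), (2, 3)])
```

Stripping the first and last hop of a (u, v)-path recovers both endpoints of the new logical link and the fibers it traverses, which is the correspondence the ILP relies on.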

We  first  define  the  following  variables  and  parameters  for  the  ILP. 

1. Parameters:

• S: the logical node set for which $\lambda_d(S)$ is to be computed.

• $C_d$: the cross-layer cuts of size d that cover all logical links in $\delta(S, V_P - S)$ in
the original logical topology $G_L$.

• $\{l^c_{ij} : \forall c \in C_d, (i, j) \in E_P\}$: 1 if physical link (i, j) is in fiber set c, and 0
otherwise.

2. Variables:

• $\{f^k_{ij} : (i, j) \in E'_P, 1 \le k \le d + 1\}$: flow variables describing the kth path
in $G'_P$ from node u to node v.

• $\{y^k_c : c \in C_d, 1 \le k \le d + 1\}$: 1 if the kth path uses any fiber in cross-layer
cut c, and 0 otherwise.

The  deficit  of  the  node  set  S  can  be  computed  by  the  formulation  below: 

Minimize $p$, subject to:

\begin{align}
p &= \sum_{1 \le k \le d+1} \sum_{x \in S} f^k_{ux} \tag{5.14} \\
y^k_c &\ge l^c_{ij} f^k_{ij} \qquad \forall (i, j) \in E_P,\ c \in C_d,\ 1 \le k \le d + 1 \tag{5.15} \\
\sum_{1 \le k \le d+1} y^k_c &\le p - 1 \qquad \forall c \in C_d \tag{5.16} \\
&\{(i, j) : f^k_{ij} = 1\} \text{ is empty, or forms a } (u, v)\text{-path in } G'_P, \qquad \forall\, 1 \le k \le d + 1 \notag \\
&f^k_{ij} \in \{0, 1\}, \quad 0 \le y^k_c \le 1 \notag
\end{align}

The formulation selects up to d + 1 paths from u to v. Each path represents a
new logical link that will be added to the logical topology. Constraint (5.14) counts
the number of new logical links selected. By Constraint (5.15), the variable $y^k_c$
indicates whether the kth logical link, if selected, will be disconnected by cross-layer
cut c. Constraint (5.16) then ensures that for any cross-layer cut c, at least one of the
new logical links will survive its failure. As a result, the solution given by the
formulation will make the node set S d-protected, and the optimal value equals $\lambda_d(S)$.
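
For very small instances, the quantity computed by the formulation can be cross-checked by enumeration. The sketch below is not the ILP itself: it assumes an explicit, hypothetical list of candidate routes (each a set of fibers for one possible new logical link across the cut) and searches for the smallest subset such that every cut in $C_d$ is avoided by at least one chosen route.

```python
from itertools import combinations

def deficit(candidate_routes, cuts, d):
    """Smallest number of candidate routes such that every cut is avoided by at least one."""
    if not cuts:
        return 0, ()
    for size in range(1, d + 2):  # Theorem 5.11: d + 1 new links always suffice
        for chosen in combinations(candidate_routes, size):
            if all(any(not (route & cut) for route in chosen) for cut in cuts):
                return size, chosen
    return float("inf"), ()       # the candidate list was too restrictive

# Hypothetical instance with d = 1: two single-fiber cuts cover the existing logical links.
d = 1
cuts = [frozenset({"a"}), frozenset({"b"})]
candidates = [frozenset({"a", "c"}), frozenset({"b", "d"}), frozenset({"c", "d"})]

print(deficit(candidates, cuts, d))      # one route through c and d avoids both cuts
print(deficit(candidates[:2], cuts, d))  # two links needed, matching the d + 1 bound
```

The enumeration is exponential in the number of candidates, so it is only useful as a sanity check against an ILP solver on toy inputs.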
Chapter  6 


Conclusion  and  Future  Work 


In  this  thesis,  we  consider  a  layered  network  model  where  the  upper-layer  logical 
links  share  the  lower-layer  physical  fibers  via  lightpath  routing.  As  such,  a  single 
physical  failure  will  cause  multiple  logical  links  to  fail  in  a  correlated  manner.  This 
phenomenon  introduces  new  challenges  in  defining,  measuring  and  optimizing  survivability 
in  the  layered  setting.  This  thesis  investigates  the  new  issues  that  arise  under 
this  model,  in  an  attempt  to  develop  useful  insights  in  survivable  layered  network 
design. 

We  start  with  an  investigation  of  the  fundamental  properties  of  layered  networks, 
and  show  that  basic  connectivity  structures,  such  as  cuts,  disjoint  paths  and  spanning 
trees,  exhibit  fundamentally  different  characteristics  from  their  single-layer  counterparts. 
This  necessitates  the  pursuit  of  new  survivability  metrics  that  properly  quantify 
the  resilience  of  the  network  against  physical  failures.  To  this  end,  we  define  a  new 
metric,  the  Min  Cross  Layer  Cut  (MCLC),  to  be  our  primary  cross-layer  metric  and 
develop  algorithms  to  design  layered  networks  with  high  MCLC  values. 

We  next  extend  our  study  to  a  setting  where  physical  link  failures  are  modelled 
as  random  events.  Under  this  model,  we  study  the  cross-layer  reliability  of  layered 
networks,  defined  to  be  the  probability  that  the  logical  topology  stays  connected  under 
the  random  physical  failures.  The  key  to  this  study  is  the  failure  polynomial,  which 
expresses  the  cross-layer  reliability  of  the  network  as  a  polynomial  in  the  physical  link 
failure  probability.  The  coefficients  of  the  polynomial  contain  important  structural 
information  about  the  layered  network.  By  exploiting  the  structures  of  cross-layer 
cuts  in  a  layered  network,  we  develop  an  efficient  algorithm  to  estimate  the  cross-layer 
reliability. 
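
As a small illustration of why the polynomial form is convenient, the sketch below evaluates a failure polynomial at several link failure probabilities without any resampling. It assumes the form $F(p) = \sum_i N_i p^i (1-p)^{m-i}$, where m is the number of physical links and $N_i$ is the number of cross-layer cuts of size i, so that the reliability is 1 - F(p). The coefficient vector used here is purely illustrative (a hypothetical network with m = 4 and MCLC 2), and `failure_probability` is a hypothetical name.

```python
def failure_probability(N, p):
    """Evaluate F(p) = sum_i N[i] * p**i * (1 - p)**(m - i), with m = len(N) - 1."""
    m = len(N) - 1
    return sum(n_i * p**i * (1 - p)**(m - i) for i, n_i in enumerate(N))

# Illustrative coefficients: no cuts of size 0 or 1, five of size 2, four of size 3, one of size 4.
N = [0, 0, 5, 4, 1]
d = next(i for i, n_i in enumerate(N) if n_i > 0)  # smallest cut size, i.e. the MCLC

for p in (0.001, 0.01, 0.1):
    exact = failure_probability(N, p)
    leading = N[d] * p**d  # low-p approximation driven by the MCLC term
    print(f"p={p}: F(p)={exact:.6g}, leading term={leading:.6g}")
```

For small p the value is dominated by the term $N_d p^d$ with d equal to the MCLC, which is why the low failure probability regime rewards a large MCLC and a small number of MCLCs.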

Through  the  study  of  the  failure  polynomial,  we  also  develop  important  insight 
into  the  connection  between  the  link  failure  probability,  the  cross-layer  reliability  and 
the  structure  of  a  layered  network.  For  the  cases  where  the  link  failure  probability  is 
sufficiently  low  or  sufficiently  high,  we  have  characterized  the  optimality  conditions 
for  lightpath  routings,  and  developed  bounds  on  the  failure  probability  regimes  where 
these  conditions  apply.  This  result  also  leads  to  a  non-trivial  sufficient  condition  for 
uniformly  optimal  lightpath  routing. 

Based  on  these  insights,  we  develop  new  algorithms  to  design  layered  networks 
that  are  optimized  for  the  low  failure  probability  regime.  Based  on  the  ideas  of 
iterative  rerouting  and  augmentation,  these  algorithms  are  able  to  achieve  locally 
optimal  solutions.  Our  simulation  results  show  that  lightpath  routings  produced  by 
these  methods  are  significantly  more  reliable  than  the  lightpath  routings  produced  by 
existing  algorithms,  and  are  more  scalable  to  large  networks. 

Throughout  the  thesis,  we  have  considered  the  connectedness  of  the  logical  topology 
as  the  survivability  requirement,  and  defined  metrics,  such  as  MCLC,  based 
on  this.  One  natural  extension  to  our  study  is  to  consider  different  survivability 
requirements.  For  example,  the  ability  to  support  protected  traffic  is  an  important 
requirement  for  many  applications.  This  requires  setting  up  primary  and  backup  connections 
that  are  physically  disjoint.  As  discussed  in  Chapter  2,  a  network  with  high 
MCLC  value  does  not  guarantee  the  existence  of  physically  disjoint  paths.  Therefore, 
metrics  based  on  maximum  cross-layer  disjoint  paths  or  minimum  survivable  path  set 
(defined  in  Section  2.2)  may  be  more  appropriate  in  this  setting.  Lightpath  routings 
that  are  optimized  for  these  metrics  may  potentially  have  different  structures  from 
the  ones  observed  in  this  thesis. 

Another  possible  future  direction  is  to  extend  the  current  network  model  to  a 
capacitated  setting.  Even  though  two  different  lightpath  routings  may  tolerate  the 
same  number  of  physical  failures  from  the  connectivity  standpoint,  the  impact  of 
such  failures  on  the  capacity  of  the  logical  topology  can  be  different.  Therefore, 
an  interesting  problem  is  to  design  lightpath  routing  algorithms  that  also  take  the 
network  capacity  and  client  traffic  pattern  into  account.  For  example,  in  [60],  an  ILP 
is  developed  to  compute  lightpath  routings  that  allow  the  logical  network  to  support 
a  given  traffic  matrix  under  single  link  failures.  It  would  be  interesting  to  study  how 
to  extend  this  result  to  the  setting  of  multiple  failures. 

Finally,  this  thesis  focuses  on  the  design  of  lightpath  routing  that  maximizes 
survivability,  assuming  the  physical  and  logical  topologies  are  given.  Conceivably,  a 
careful  choice  of  the  physical  and  logical  topologies  will  make  this  lightpath  routing 
problem  easier.  Therefore,  the  design  of  physical  and  logical  topologies  is  an  equally 
important  problem.  Conjecture  1  in  Section  4.4.1,  which  describes  a  special  condition 
for  the  existence  of  uniformly  optimal  lightpath  routing,  would  be  a  good  starting 
point  to  attack  this  problem  area. 